I am new to MATLAB, and however much I struggle I cannot work out how to construct a maximum likelihood (ML) algorithm for a multivariate Bernoulli distribution.
I have a dataset of N variables (x1,x2,...,xN), where each variable is a D-dimensional vector (Dx1), and a parameter vector of the form p=(p1,p2,...,pD). So the Bernoulli distribution should have the form:
Pr(X|p) = Π_n Π_d p(d)^x(nd) * (1-p(d))^(1-x(nd)), with n = 1..N and d = 1..D
The code that I created uses MATLAB's mle function:
for n=1:D
prob(n)=mle(dataset(:,n),'distribution', 'bernoulli');
end
which gives me a D vector of estimated probabilities from the dataset.
But what I am really interested in is how to implement the ML estimation step by step in MATLAB rather than just calling mle.
Thank you very much.
phat for a Bernoulli distribution is the proportion of successes among the trials. To do it manually, count the successes (the 1s) in each of your vectors and divide by the length of the vector. Here's a quick way, assuming 1s are successes and each variable is stored as a column of the matrix.
bern_mat = [0 0 1 0 1 1; 1 1 0 1 0 0 ; 1 0 1 0 1 1]; % 3x6 matrix of 1's and 0's
phat = sum(bern_mat,1)/size(bern_mat,1); % sum across the first dim then divide by size of first dim.
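To make the step-by-step view concrete, here is a rough sketch that evaluates the Bernoulli log-likelihood on a grid of candidate probabilities for each column and compares the maximizer with the closed-form count-and-divide estimate. The names X, pgrid, phat_closed and phat_grid are made up for this illustration, and the grid search is only a sanity check, not something you would need in practice.

```matlab
% Toy data: N = 3 observations (rows), D = 6 binary variables (columns).
X = [0 0 1 0 1 1; 1 1 0 1 0 0; 1 0 1 0 1 1];
[N, D] = size(X);

% Closed form: setting the derivative of the log-likelihood to zero
% gives p_d = (number of 1s in column d) / N.
phat_closed = sum(X, 1) / N;

% Numerical check: evaluate the log-likelihood on a grid and take the argmax.
pgrid = 0.001:0.001:0.999;            % candidate values for each p_d
phat_grid = zeros(1, D);
for d = 1:D
    k = sum(X(:, d));                 % successes in column d
    loglik = k * log(pgrid) + (N - k) * log(1 - pgrid);
    [~, idx] = max(loglik);           % index of the most likely candidate
    phat_grid(d) = pgrid(idx);
end
```

Up to the grid resolution, phat_grid should agree with phat_closed, which is why the simple proportion is all you need.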
I am simulating a mini AES encryption/decryption algorithm in MATLAB. For this I need to multiply two 4-bit numbers while treating them as polynomials. This goes through several stages: converting the numbers to polynomials, multiplying the two polynomials, reducing the result to a lower degree if needed using a predefined irreducible polynomial, and converting back to 4-bit format.
For instance, multiplying 1011 ⊗ 0111 is analogous to (x^3+x+1) ⊗ (x^2+x+1). The result, x^5+x^4+1, has degree 5, so it must be reduced by dividing by the predefined polynomial x^4+x+1. The answer is x^2, that is, 0100.
I know that MATLAB has some functions for polynomial multiplication, but they are rather general, and I need a specific function or method to do this.
Many thanks in advance!
Polynomial multiplication/division is the same as convolution/deconvolution of the coefficients. mod(...,2) is then applied to the results.
I'm not quite sure that this two-step process is correct for GF; please try it with some other polynomials to see whether the results are what you expect:
x = [1 0 1 1];
y = [0 1 1 1];
product = conv(x, y);
product = mod(product ,2);
divider = [1 0 0 1 1];
[~, remainder] = deconv(product, divider);
remainder = mod(remainder, 2);
This gives
product =
0 1 1 0 0 0 1
remainder =
0 0 0 0 1 0 0
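As a cross-check that stays entirely in integer arithmetic, the same multiplication can be sketched with bit operations: XOR plays the role of addition modulo 2 and shifts play the role of multiplying by powers of x. This is only an illustration for the 4-bit case in the question; the variable names a, b, m, p and result are made up here.

```matlab
a = bin2dec('1011');      % x^3 + x + 1
b = bin2dec('0111');      % x^2 + x + 1
m = bin2dec('10011');     % x^4 + x + 1, the reducing polynomial from the question

% Carry-less multiplication: XOR a shifted copy of a for every set bit of b.
p = 0;
for k = 0:3
    if bitget(b, k + 1)
        p = bitxor(p, bitshift(a, k));
    end
end

% Reduce modulo m: cancel every term of degree >= 4, highest degree first.
for k = 6:-1:4
    if bitget(p, k + 1)
        p = bitxor(p, bitshift(m, k - 4));
    end
end

result = dec2bin(p, 4);   % '0100', i.e. x^2
```

For these inputs it gives the same 0100 as the conv/deconv version, which is some evidence that the two-step approach is fine when the divider is monic.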
I'm kind of new to MATLAB and Stack Overflow, so if I do something outside the guidelines, please don't hesitate to point it out. Thanks!
I have been trying to do convolution between two functions and I have been having a hard time trying to get it to work.
t=0:.01:10;
h=exp(-t);
x=zeros(size(t)); % When I used length(t), I would get an error that says in conv(), A and B must be vectors.
x(1)=2;
x(4)=5;
y=conv(h,x);
figure; subplot(3,1,1);plot(t,x); % The discrete function would not show (at x=1 and x=4)
subplot(3,1,2);plot(t,h);
subplot(3,1,3);plot(t,y(1:length(t))); %Nothing is plotted here when ran
I commented my issues in the code. I don't understand the difference between length and size in this case, or why it matters.
For the second comment, x=1 should have an amplitude of 2, while x=4 should have an amplitude of 5. When plotted, nothing shows at the specified locations, and the plot looks jumbled near x=0. I'm assuming that's why the convolved plot isn't displayed.
The original problem statement is given if it helps to understand what I was thinking throughout.
Consider an input signal x(t) that consists of two delta functions at t = 1 and t = 4 with amplitudes A1 = 5 and A2 = 2, respectively, to a linear system with impulse response h that is an exponential pulse (h(t) = e^(−t)). Plot x(t), h(t) and the output of the linear system y(t) for t in the range of 0 to 10 using increments of 0.01. Use the MATLAB built-in function conv.
The initial question regarding size vs length
length yields a scalar that is equal to the largest dimension of the input. In the case of your array, the size is 1 x N, so length yields N.
size(t)
% 1 1001
length(t)
% 1001
If you pass a scalar (N) to ones, zeros, or a similar function, it will create a square matrix that is N x N. This results in the error that you see when using conv since conv does not accept matrix inputs.
size(ones(length(t)))
% 1001 1001
When you pass a vector to ones or zeros, the output has the dimensions given by that vector. Since size returns a vector (as shown above), the output is a 1 x N vector and conv has no issues.
size(ones(size(t)))
% 1 1001
If you want a vector, you need to explicitly specify the number of rows and columns. Also, in my opinion, it's better to use numel to get the number of elements in a vector, as it's less ambiguous than length.
z = zeros(1, numel(t));
The second question regarding the convolution output:
First of all, the impulses that you create are at the first and fourth index of x and not at the locations where t = 1 and t = 4. Since you create t using a spacing of 0.01, t(1) actually corresponds to t = 0 and t(4) corresponds to t = 0.03
You instead want to use the value of t to specify where to put your impulses
x(t == 1) = 2;
x(t == 4) = 5;
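Note that due to floating point errors, you may not have exactly t == 1 and t == 4, so you can compare against a small tolerance instead. eps by itself (about 2.2e-16) is smaller than the spacing between doubles near 4, so pick a tolerance that is tiny relative to the 0.01 step:
tol = 1e-9;
x(abs(t - 1) < tol) = 2;
x(abs(t - 4) < tol) = 5;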
Once we make this change, we get the expected output: scaled and shifted copies of the impulse response.
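Putting the pieces together, a minimal sketch of the corrected script might look like the following. It keeps the amplitudes used in the question's code (2 at t = 1 and 5 at t = 4), and the tolerance tol is an arbitrary choice that only needs to be much smaller than the 0.01 step.

```matlab
t = 0:0.01:10;
h = exp(-t);                      % impulse response
x = zeros(size(t));               % 1 x N row vector, same shape as t

tol = 1e-9;                       % assumed tolerance, far below the 0.01 step
x(abs(t - 1) < tol) = 2;          % impulse at t = 1
x(abs(t - 4) < tol) = 5;          % impulse at t = 4

y = conv(h, x);                   % full convolution, length 2*N - 1
y = y(1:numel(t));                % keep the part that lines up with t
```

For display, stem(t, x) tends to show the two impulses more clearly than plot(t, x), and plot(t, y) should then show one decaying exponential starting at t = 1 and a second, larger one added at t = 4.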
I want to calculate the area under the receiver operating characteristic (ROC) curve in a loop. My loop uses a kind of cross-validation. In some iterations my code suddenly stops and returns this error from the perfcurve function:
Less than two classes are found in the array of true class labels.
When I check the inputs of perfcurve, I have for instance:
labels=
1 1 1 1 1 1 1 1 1 1 1 1
scores=
1 0 0 1 1 0 1 0 0 0 1 1
The function I'm using is labels(labels,scores,'1'). As you know, computing the ROC requires the true positive rate and the false positive rate. We have these two values in my example above! Why can't this function calculate the ROC?
It can't calculate the AUC because there is no false positive rate: with no true 0s in the labels, there is nothing to divide the false positives by. The definitions of true positive (TP) and false positive (FP) are:
TP: true 1s which are (correctly) predicted as 1s.
FP: true 0s which are (incorrectly) predicted as 1s.
Basically, if your labels are all 0s or all 1s, you won't get both TP and FP.
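One way to keep a cross-validation loop from stopping on such a fold is to check the labels before calling perfcurve. The sketch below uses the example data from the question; the variable auc and the choice to record NaN for a skipped fold are assumptions for illustration, not the only reasonable handling.

```matlab
% Example fold from the question: every true label is 1.
labels = [1 1 1 1 1 1 1 1 1 1 1 1];
scores = [1 0 0 1 1 0 1 0 0 0 1 1];

if numel(unique(labels)) < 2
    % Only one class in this fold, so the ROC curve is undefined; skip it.
    auc = NaN;
else
    % perfcurve requires the Statistics and Machine Learning Toolbox.
    [~, ~, ~, auc] = perfcurve(labels, scores, 1);
end
```

If single-class folds come up often, partitioning the data so that each fold contains both classes (a stratified split) is probably the cleaner fix.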
Are you sure? Are you using function 'labels' as you mentioned? :)
perfcurve:
[X,Y] = perfcurve(labels,scores,posclass) computes a ROC curve for a vector of classifier predictions scores given true class labels, labels.
labels can be a numeric vector, logical vector, character matrix, cell array of strings or categorical vector.
scores is a numeric vector of scores returned by a classifier for some data.
posclass is the positive class label (scalar), either numeric (for numeric labels), logical (for logical labels), or char.
Definition:
A(i, j) = 1 is a midpoint of a cross if the elements
A(i-1, j) = 1
A(i+1, j) = 1
A(i, j+1) = 1
A(i, j-1) = 1.
Together the elements and the midpoint form a cross in a matrix A, where A is at least a 3-by-3 matrix and i, j ∈ ℕ\{0}.
Suppose the image above is the 8-by-8 matrix A with natural numbers 1, 2, 3 ... as elements. From this definition the matrix has a total of 3 crosses. The crosses have their midpoints on A(2,2), A(5, 4) and A(5, 5).
What I want to do is write a function that finds the number of crosses in the matrix A. I have an idea but I'm not sure it's the most optimal one. Here's the pseudocode for it:
ITERATE FROM row 2 TO row 7
    ITERATE FROM column 1 TO column 8
        IF current element contains 1
            INCREMENT xcount by 1
            IF xcount >= 3
                CHECK IF the counted 1s are part of a cross
        ELSE IF xcount IS NOT 0
            SET xcount to 0
The idea is to iterate through every column from row 2 to row 7. If I find three consecutive 1s on the same row, I immediately check whether those 1s belong to a cross. This should work, but imagine having a very large matrix A: how efficient would this code be in that situation? Couldn't this problem be solved using vector notation?
Any answer is very much appreciated. Thanks in advance!
Not near MATLAB at the moment, but this is what I'd do. Assuming A is binary (has only 0s and 1s):
crs=[0 1 0 ; 1 1 1 ; 0 1 0]; % a minimal "cross" filter
C=conv2(A,crs./sum(crs(:)),'same'); % convolve A with it
[x y]=find(C>0.9); % find x,y positions of the crosses by looking
% for peak values of C
So you basically convolve with a "minimal" (normalized) cross (crs) and look for peaks by thresholding the result. x and y are the row and column indices of your cross positions. No need for for loops, just the built-in (and pretty fast) 2-D convolution and find.
The threshold condition C>0.9 just illustrates that there needs to be a threshold weighted by the intensity of crs. In this case I have normalized crs in the convolution line (crs./sum(crs(:))), so if A is a binary matrix as in the example, the convolution with the minimal normalized cross leaves the value 1 at the pixel where a cross is centered, whereas other pixels will be less than 1 (that's why I arbitrarily chose 0.9). So you can replace the threshold with C==1 if A is always binary.
Another way to visualize the positions of the crosses is to look at C.*(C==1). This generates a matrix the size of A with 1s only where the crosses are.
EDIT:
For maximal speed, you may consider writing it as a one liner, for example:
[x y]=find(conv2(A,[0 1 0 ; 1 1 1 ; 0 1 0]./5,'same')==1);
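To try this out, here is a small sketch on a made-up 8-by-8 binary matrix (not the one from the question's image) with crosses centered at (2,2), (5,4) and (5,5). It uses the unnormalized kernel and compares against 5, which keeps everything in exact integer arithmetic; the names kernel, rows, cols and ncross are assumptions for this example.

```matlab
A = zeros(8);
A(2, 1:3) = 1;  A(1:3, 2) = 1;       % cross centered at (2,2)
A(5, 3:6) = 1;  A(4:6, 4:5) = 1;     % two overlapping crosses at (5,4) and (5,5)

kernel = [0 1 0; 1 1 1; 0 1 0];      % unnormalized minimal cross
C = conv2(A, kernel, 'same');        % count of 1s under the cross at each pixel

[rows, cols] = find(C == 5);         % all five cells set means a cross midpoint
ncross = numel(rows);
```

Comparing integer counts against 5 sidesteps any rounding concerns that an exact ==1 test on the normalized result might raise.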
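Using bit masks:
% ux(i,j) is true when A(i,j-1), A(i,j) and A(i,j+1) are all set (horizontal bar)
ux = [false(size(A,1),1) (A(:,3:end) & A(:,2:end-1) & A(:,1:end-2)) false(size(A,1),1)];
% uy(i,j) is true when A(i-1,j), A(i,j) and A(i+1,j) are all set (vertical bar)
uy = [false(1,size(A,2)); (A(3:end,:) & A(2:end-1,:) & A(1:end-2,:)); false(1, size(A,2))];
u = ux & uy; % true at cross midpoints
[x y] = find(u) % row and column indices of the midpoints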
I am trying to solve the following optimization problem in octave
The first constraint is that A be positive semi-definite.
S is a set of data points such that if (xi,xj) is in S then xi is similar to xj and D is a set of data points such that if (xi,xj) is in D then xi and xj are dissimilar. Note that the above formula is 2 separate sums and the second sum is not nested. Also xi and xj are assumed to be column vectors of length N.
Because this is a nonlinear optimization, I am trying to use Octave's nonlinear program solver, sqp.
The problem is that if I just provide it with the function to optimize, then on some small toy tests the BFGS method used to find the Hessian fails. Because of this I tried to provide my own Hessian function, but now this error occurs:
error: __qp__: operator *: nonconformant arguments (op1 is 2x2, op2 is 3x1)
error: called from:
error: /usr/share/octave/3.6.3/m/optimization/qp.m at line 393, column 26
error: /usr/share/octave/3.6.3/m/optimization/sqp.m at line 414, column 32
when I make the following call to sqp
[A, ~, Info] = sqp(initial_guess, {@toOpt, @CalculateGradient, @CalculateHessian}, ...
[],[],0,[],maxiter);
I simplified the constraint that A be positive semi-definite and diagonal by only solving for the diagonal entries and constraining all the diagonal entries to be >=0. initial_guess is a vector of ones that is N long.
Here is my code to calculate what I believe to be the Hessian matrix
%Hessian = CalculateHessian(A)
%calculates the Hessian of the function we are optimizing as follows
%H(i,j) = (sumsq(D(:,i),1) * sumsq(D(:,j),1)) / (sum(A.*sumsq(D,1))^2)
%where D is a matrix of differences between observations that are dissimilar, with one difference on each row
%and sumsq is the sum of the squares
%input A: the current guess for A
%output Hessian: The hessian of the function we are optimizing
function Hessian = CalculateHessian(A)
  global HessianNumerator; %this is a matrix with the numerator of H(i,j)
  global Dsum_of_squares; %the sum of the squares of the differences of each dimension of the dissimilar observations
  if(iscolumn(A)) %if A is a column vector
    A = A'; %make it a row vector. necessary to prevent broadcasting
  endif
  if(~isempty(Dsum_of_squares)) %if dissimilar constraints were provided
    Hessian = HessianNumerator / (sum(A.*Dsum_of_squares)^2)
  else
    Hessian = HessianNumerator; %the hessian is a matrix of 0s
  endif
endfunction
and Dsum_of_squares and HessianNumerator are computed as follows:
[dissimilarRow,dissimilarColumn] = find(D); %find which observations are dissimilar to each other
DissimilarDiffs = X(dissimilarRow,:) - X(dissimilarColumn,:); %take the difference between the dissimilar observations
Dsum_of_squares = sumsq(DissimilarDiffs,1);
HessianNumerator = Dsum_of_squares .* Dsum_of_squares'; %calculate the numerator of the Hessian. it is a constant value
X is an M x N matrix with one observation per row.
D is an M x M dissimilarity matrix: D(i,j) is 1 if row i of X is dissimilar to row j, and 0 otherwise.
I believe my error is in one of the following areas (from least likely to most likely):
1. The math I used to derive the Hessian function is wrong. The formula I am using is in my comments for the function.
2. My implementation of the math.
3. The Hessian matrix that sqp wants is different from the one described on the Hessian Matrix Wikipedia page.
Any help would be greatly appreciated. If you need me to post more code I would be happy to do so. Right now the amount of code to try and solve the optimization is about 160 lines.
Here is the test case I am running that causes the code to fail. It works if I only pass it the gradient function.
X = [1 2 3;
4 5 6;
7 8 9;
10 11 12];
S = [0 1 1 0;
1 0 0 0;
1 0 0 0;
0 0 0 0]; %this means row 1 of X is similar to rows 2 and 3
D = [0 0 0 0;
0 0 0 0;
0 0 0 1;
0 0 1 0]; %this means row 3 of X is dissimilar to row 4
gml(X,S,D, 200); %200 is the maximum number of iterations for sqp to run