Using the Merge layer in Keras with a dot product

I am trying to merge two layers together. My input, or my processed data, appears as such:
[[2069 2297 3087 ..., 0 0 0]
[2069 2297 3087 ..., 0 0 0]
[2069 2297 3087 ..., 0 0 0]
...,
[2711 4215 875 ..., 0 0 0]
[5324 1412 1301 ..., 0 0 0]
[5065 3561 5002 ..., 0 0 0]]
Each row represents a sequence of words, and each number is the index of a specific word. I have two such arrays and am trying to merge them by first embedding them into 16-dimensional word vectors and then taking a dot product. To do this, I created two branches to embed the data first, and then try to merge them.
When I try to merge the two using this function in Keras:
model = Sequential()
model.add(Merge( [x1_branch, x2_branch], mode = 'dot'))
I get the following error:
ValueError: Error when checking target: expected merge_1 to have 3 dimensions, but got array with shape (162, 1)
I believe that the matrix multiplication was executed, as written and described in the documentation:
"E.g. if applied to two tensors a and b of shape (batch_size, n), the output will be a tensor of shape (batch_size, 1) where each entry i will be the dot product between a[i] and b[i]."
Obviously, my batch size for this sample is 162, but the error still makes no sense to me. How can the merge layer expect an input if it has, seemingly, already done the calculation?
I would greatly appreciate any help. Thanks!
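One likely reading of the error: it complains about the target, not the input. After embedding, each branch outputs a 3-D tensor of shape (batch, seq_len, embed_dim), so a dot-product merge contracted over the embedding axis yields a 3-D result, while the labels have shape (162, 1). A minimal NumPy sketch of the shape arithmetic (seq_len = 10 is an assumption; any sequence length shows the same effect):

```python
import numpy as np

# Hypothetical shapes standing in for the two embedded branches:
# each branch outputs (batch_size, seq_len, embed_dim) = (162, 10, 16).
batch_size, seq_len, embed_dim = 162, 10, 16
rng = np.random.default_rng(0)
a = rng.normal(size=(batch_size, seq_len, embed_dim))
b = rng.normal(size=(batch_size, seq_len, embed_dim))

# A dot-product merge over the embedding axis contracts embed_dim and
# leaves a (batch_size, seq_len, seq_len) tensor -- three dimensions,
# which is consistent with Keras complaining that the (162, 1) target
# does not have 3 dimensions.
merged = np.einsum('bie,bje->bij', a, b)
print(merged.shape)  # (162, 10, 10)
```

The quoted documentation sentence applies to 2-D inputs of shape (batch_size, n); with embedded sequences the inputs are 3-D, so the output is not (batch_size, 1).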

Related

I'm trying to build a Neural Network in python

I'm trying to build a neural network and am following a tutorial.
What do those two lines mean?
syn0 = 2*np.random.random((3,4)) - 1
syn1 = 2*np.random.random((4,1)) - 1
Specifically, those values: (3,4) and (4,1).
I just don't get it.
I think I know what the first synapse's values mean, but not the second one's.
np.random.random creates an array of random values between 0 and 1 - the parameter it takes is the desired shape of the array, which is what (3,4) and (4,1) are in your example.
Simple random weight initialisation is sufficient to train your neural network, but initialising them with a mean of 0 speeds up training, which is what 2*np.random.random((3,4)) -1 does:
np.random.random((3,4))          # array with values in range [0, 1) and mean 0.5
2 * np.random.random((3,4))      # array with values in range [0, 2) and mean 1
2 * np.random.random((3,4)) - 1  # array with values in range [-1, 1) and mean 0
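Putting the two lines together, a runnable sketch of how the shapes chain through a 3-input, 4-hidden, 1-output network (the seed and the sample input are arbitrary):

```python
import numpy as np

np.random.seed(0)
syn0 = 2 * np.random.random((3, 4)) - 1  # weights from 3 inputs to 4 hidden units
syn1 = 2 * np.random.random((4, 1)) - 1  # weights from 4 hidden units to 1 output

# The shapes chain together: a (1, 3) input times (3, 4) gives (1, 4),
# which times (4, 1) gives the single (1, 1) output.
x = np.random.random((1, 3))
hidden = x @ syn0
out = hidden @ syn1
print(syn0.shape, syn1.shape, out.shape)

# Values are drawn from [-1, 1), so the mean is roughly 0.
assert syn0.min() >= -1 and syn0.max() < 1
```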

Matlab: Covariance Matrix from matrix of combinations using E(X) and E(X^2)

I have a set of independent binary random variables (say A, B, C), each taking a positive value with some probability and zero otherwise. I have generated a matrix of 0s and 1s of all possible combinations of these variables containing at least one 1, i.e.
A B C
1 0 0
0 1 0
0 0 1
1 1 0
etc.
I know the values and probabilities of A, B, C, so I can calculate E(X) and E(X^2) for each. I want to treat each combination in the above matrix as a new random variable equal to the product of the random variables present in that combination (those showing a 1 in the matrix). For example, the random variable Row4 = A*B.
I have created a matrix of the same size as the above, which shows the relevant E(X)s instead of the 1s, and 1s instead of the 0s. This lets me easily calculate the vector of expected values of the new random variables (one per combination) as the product of each row. I have also generated a similar matrix which shows E(X^2) instead of E(X), and another which shows prob(X>0) instead of E(X).
I'm looking for a Matlab script that computes the Covariance matrix of these new variables i.e. taking each row as a random variable. I presume it will have to use the formula:
Cov(X,Y)=E(XY)-E(X)E(Y)
For example, for rows (1 1 0) and (1 0 1):
Cov(X,Y)=E[(AB)(AC)]-E(X)E(Y)
=E[(A^2)BC]-E(X)E(Y)
=E(A^2)E(B)E(C)-E(X)E(Y)
These values I already have from the matrices I've mentioned above. For each Covariance, I'm just unsure how to know which two variables appear in both rows, because for those I will have to select E(X^2) instead of E(X).
Alternatively, the above can be written as:
Cov(X,Y)=E(X)E(Y)*[1/prob(A>0)-1]
But the problem remains as the probabilities in the denominator will only be the ones of the variables which are shared between two combinations.
Any advice on how automate the computation of the Covariance matrix in Matlab would be greatly appreciated.
I'm pretty sure this is not the most efficient way to do that but that's a start:
Assume r1...rn are the combinations of the random variables, and R is the matrix:
A B C
r1 1 0 0
r2 0 1 0
r3 0 0 1
r4 1 1 0
Suppose you have the vectors E1, E2 and ER:
E1 = [E(A) E(B) E(C) ...]
E2 = [E(A²) E(B²) E(C²) ...]
ER = [E(r1) E(r2) E(r3) ...]
If you want to compute E(r1*r2) you can:
1) Extract rows r1 and r2 from R:
v1 = R(1,:)
v2 = R(2,:)
2) Sum both vectors in vs
vs = v1 + v2
3) Loop over vs: a 2 means the variable appears in both rows, so use its value from E2; a 1 means use its value from E1; a 0 means skip the variable.
4) Multiply the selected values together to obtain E(r1*r2) as wanted.
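The steps above can be sketched as a short Python/NumPy script (the E(X) and E(X^2) values below are placeholders, not from the question; the variable names E1, E2, ER follow the answer):

```python
import numpy as np

# Hypothetical moments for A, B, C -- placeholders for illustration.
E1 = np.array([0.5, 0.6, 0.7])   # [E(A), E(B), E(C)]
E2 = np.array([0.4, 0.5, 0.6])   # [E(A^2), E(B^2), E(C^2)]

# Combination matrix R: one row per new random variable.
R = np.array([[1, 0, 0],
              [0, 1, 0],
              [0, 0, 1],
              [1, 1, 0]])

# E(r_i) is the product of E(X) over the variables present in row i.
ER = np.prod(np.where(R == 1, E1, 1.0), axis=1)

n = R.shape[0]
C = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        vs = R[i] + R[j]               # 2 = shared, 1 = in one row only, 0 = absent
        factors = np.where(vs == 2, E2, np.where(vs == 1, E1, 1.0))
        Erirj = np.prod(factors)       # E(r_i * r_j) by independence
        C[i, j] = Erirj - ER[i] * ER[j]
print(C)
```

As a sanity check, the diagonal entry for row 1 is E(A^2) - E(A)^2, and the covariance between the disjoint combinations (1 0 0) and (0 1 0) comes out zero, as independence requires.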

Neural Networks for integer values

I have approximately 5000 integer vectors (=SIZE) that look like:
[1 0 4 2 0 1 3 ...]
They all have the same length N = 32 and their values range from 0 to 4, but let's say [0, MAX].
I created a NN that takes these vectors as inputs and outputs a binary array corresponding to one of the desired outputs (number of possible outputs = M):
for instance, [0 1 0 0 ... 0] => 2nd output; array_length = M
I used a Multi Layer Perceptron in Neuroph with those integer values but it did not converge.
So I am guessing the problem is either the integer input values or the use of an MLP with 3 layers: input, hidden and output.
Can you advise me on the network structure? Which type of NN is suitable? Should I remodel the input and output to simplify the learning process? I have been thinking about Gray encoding for the integer inputs.
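One common preprocessing step for a setup like this (an assumption on my part, not something the question confirms) is to scale the integer inputs into [0, 1] and one-hot encode the target class, so the network sees inputs in a range typical activation functions handle well:

```python
import numpy as np

MAX = 4   # maximum integer value in the inputs (from the question)
M = 5     # hypothetical number of output classes

x = np.array([1, 0, 4, 2, 0, 1, 3])
x_scaled = x / MAX            # values now lie in [0, 1]

label = 1                     # "2nd output" in zero-based indexing
y = np.zeros(M)
y[label] = 1                  # one-hot target: [0, 1, 0, 0, 0]
print(x_scaled, y)
```

Whether this fixes the convergence problem depends on the rest of the setup (learning rate, hidden-layer size), but it is a cheap thing to try before changing the architecture.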

Replace zero values in vector

I've got a vector like this:
a=[0 5 3 0 1]
and a corresponding vector containing as many numbers as there are zeros in the first vector:
b=[2 4]
what I want to get is
x=[2 5 3 4 1]
I tried fiddling around and got the feeling that the find / full methods might help me here, but didn't get it to work:
c=(a==0)
>[1 0 0 1 0]
Thank you!
It is as easy as this:
x=a;
x==0 gives a logical vector marking the elements equal to 0, i.e. [1 0 0 1 0], so x(x==0) indexes x to get exactly those elements. You can then assign values to them as if they were any other vector/matrix (the elements we are not interested in are simply not indexed):
x(x==0)=b;
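As an aside, the same logical-indexing trick works essentially unchanged in Python/NumPy:

```python
import numpy as np

a = np.array([0, 5, 3, 0, 1])
b = np.array([2, 4])

# a == 0 marks the zero positions; assigning through that boolean mask
# fills those positions with b's values in order.
x = a.copy()
x[x == 0] = b
print(x)  # [2 5 3 4 1]
```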

Training a Decision Tree in MATLAB over binary train data

I want to train a decision tree in MATLAB for binary data. Here is a sample of data I use.
traindata <87*239> [array of data with 239 features]
1 0 1 0 0 0 1 1 0 0 1 0 1 0 1 1 1 1 1 0 0 0 1 1 0 ... [till 239]
1 1 1 0 0 0 1 0 0 0 1 0 1 0 1 1 0 0 1 0 0 0 1 0 1 ... [till 239]
....
The data corresponds to a form which has only yes/no options. The outcome of the form is also binary and indicates whether a patient has some medical disorder or not. We have used a classification tree, but the classifier shows us double numbers; for example, it branches the first node on whether the value of x137 is bigger than 0.75 or not. Since we don't have 0.75 in our data and it has no yes/no meaning, we want a decision tree better suited to our work: one trained on boolean variables rather than doubles, which understands that the data is not continuous and, for example, instead of the above representation shows x137 as yes or no (1 or 0). Can someone help me with this? I would also appreciate a solution to map our data to double variables and features if a boolean decision tree is not applicable. I am currently using classregtree in MATLAB with <87*237> as the training data and <87*1> as the results.
classregtree has an optional input parameter categorical. Using this option, you can pass in a vector indicating which of your input variables are categorical (in your case, this vector would be 1x239, all ones). The decision tree should then contain yes/no decisions rather than numerical thresholds.
From the help of classregtree:
t = classregtree(X,y) creates a decision tree t for predicting the response y as a function of the predictors in the columns of X. X is an n-by-m matrix of predictor values. If y is a vector of n response values, classregtree performs regression. If y is a categorical variable, character array, or cell array of strings, classregtree performs classification.
What's the type of y in your case? It seems that classregtree is doing regression, but you want classification, so y should be a categorical variable.
EDIT: To make your y categorical, you can try "nominal(y)".
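It may also be reassuring that, for purely 0/1 features, the numeric thresholds are cosmetic rather than wrong: any cut point strictly between 0 and 1 encodes the same yes/no split. A tiny illustration (plain Python/NumPy, independent of classregtree):

```python
import numpy as np

# With purely binary features, a split like "x137 > 0.75" is exactly
# the same partition as "x137 == 1" (i.e. answered yes).
x137 = np.array([1, 0, 1, 1, 0, 0, 1])
numeric_split = x137 > 0.75
yes_no_split = x137 == 1
print(np.array_equal(numeric_split, yes_no_split))  # True
```

So the categorical option mainly changes how the tree is displayed and how ties are handled, not which patients end up in which branch.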