How could one, in an unsupervised environment, get a feature vectors for a sentence. I believe for an image data one could build a conv auto encoder and take the hidden layers outputs. What would be the best way to do this for RNN type models (LSTM, GRU, etc.).
Would this (https://papers.nips.cc/paper/5271-pre-training-of-recurrent-neural-networks-via-linear-autoencoders.pdf) be on the right track?
The easiest thing to do would be doing something with the word2vec representations of all the words (like summing them?).
See this post:
How to get vector for a sentence from the word2vec of tokens in sentence
Related
I am using a GloVe based pre-trained embedded vectors for words in my I/P sentences to a NMT-like model. The model then generates a series of word embeddings as its output for each sentence.
How can I convert these output word embeddings to respective words? One way I tried is using cosine similarity between each output embedding vector and all the i/p embedding vectors. Is there a better way than this?
Also, is there a better way to approach this than using embedding vectors?
First of all the question is lacking a lot of details like the library used for word embedding, the nature of the model, and the training data, etc ...
But I will try to give you an idea what you can do in those situations, assuming you are using some word embedding library like Gensim.
How to get the word from the vector:
We are dealing with predicted word vectors here, so our word vector may not be the exact vector of the original word, we have to use similarity, in Gensim you can use similar_by_vector, something like
target_word_candidates = similar_by_vector(target_word_vector,top=3)
That's would solve the reverse lookup problem, as highlighted here, given all the word vectors how to get the most similar word, but we need to find the best single word according to the context.
You can use some sort of post-processing on the output target word vectors, this would be beneficial for trying to solve some problems like:
1.How to guide the translation of out-of-vocabulary
terms?
2.How to enforce the presence of a
given translation recommendation in the decoder’s
output?
3.How to place these word(s) in the right
position?
One of the ideas is to use an external resource for the target language, i.e. language model, to predict which combination of words are gonna be used. Some other techniques incorporate the external knowledge inside the translation network itself
I am trying to cluster a dataset using an encoder and since I am new in this field I cant tell how to do it.My main issue is how to define the loss function since the dataset is unlabeled and up to know, what I have seen from bibliography they define as loss function the distance between the desired output and the predicted output.My question is since that I dont have a desired output how should I implement this?
You can use an auto encoder to pre-train your convolutional layers, like it described in my question here with usage of convolutional autoencoder for images
As you can see form code, loss function is Adam with metrics accuracy and dice coefficient, I think you can use accuracy only, since dice coefficient is image-specific
I’m not sure how it will work for you, because you hadn’t provided your idea how you will transform your bibliography lists to vector, perhaps you will create a list for bibliography id’s sorted by the cosine distance between them
For example, you can use a set of vector with cosine distances to each item in a bibliography list above for each reference in your dataset and use it as input for autoencoder
After encoder will be trained, you can remove the decoder part from your model output and use as an input for one of unsupervised clustering algorithms, for example, k-mean. You can find details about them here
I loaded a word2vec model using Google News dataset. Now I want to get the Word2Vec representations of a list of sentences that I wish to cluster. After going through the documentation I found this gensim.models.word2vec.LineSentencebut I'm not sure this is what I am looking for.
There should be a way to get word2vec representations of a list of sentences from a pretrained model right? None of the links I searched had anything about it. Any leads would be appreciated.
Word2Vec only offers vector representations for words, not sentences.
One crude but somewhat effective (for some purposes) way to go from word-vectors to vectors for longer texts (like sentences) is to average all the word-vectors together. This isn't a function of the gensim Word2Vec class; you have to code this yourself.
For example, with the word-vectors already loaded as word_model, you'd roughly do:
import numpy as np
sentence_tokens = "I do not like green eggs and ham".split()
sum_vector = np.zeros(word_model.vector_size)
for token in sentence_tokens:
sum_vector += word_model[token]
sentence_vector = sum_vector / len(sentence_tokens)
Real code might add handling for when the tokens aren't all known to the model, or other ways of tokenizing/filtering the text, and so forth.
There are other more sophisticated ways to get the vector for a length-of-text, such as the 'Paragraph Vectors' algorithm implemented by gensim's Doc2Vec class. These don't necessarily start with pretrained word-vectors, but can be trained on your own corpus of texts.
So, I am using word2vec in Java, and trying to train it somehow so that it gives me vector representation for words and sentences.
Can I use this for feeding input into a neural network, to get a response on the basis of the word2vec data? I am planning to make a chatbot with the help of this.
Adding on to #galloguille's comments, you can use pre-trained word2vec's word-vectors to initialize your RNN. RNN can learn from sequence of words to predict next word(s). A good example with code for this, you can find here - https://github.com/larspars/word-rnn.
There is good collection of current state of the art on chatbots here - https://stanfy.com/blog/the-rise-of-chat-bots-useful-links-articles-libraries-and-platforms/
From my understanding, most effective chatbots don't use RNN directly (at present) to reply to a question, but try to predict intent (from a fixed set of intents) of the question in the first step. Based on each intent, they calculate some actionable insights and a logical reply to the question.
I've recently been delving into artificial neural networks again, both evolved and trained. I had a question regarding what methods, if any, to solve for inputs that would result in a target output set. Is there a name for this? Everything I try to look for leads me to backpropagation which isn't necessarily what I need. In my search, the closest thing I've come to expressing my question is
Is it possible to run a neural network in reverse?
Which told me that there, indeed, would be many solutions for networks that had varying numbers of nodes for the layers and they would not be trivial to solve for. I had the idea of just marching toward an ideal set of inputs using the weights that have been established during learning. Does anyone else have experience doing something like this?
In order to elaborate:
Say you have a network with 401 input nodes which represents a 20x20 grayscale image and a bias, two hidden layers consisting of 100+25 nodes, as well as 6 output nodes representing a classification (symbols, roman numerals, etc).
After training a neural network so that it can classify with an acceptable error, I would like to run the network backwards. This would mean I would input a classification in the output that I would like to see, and the network would imagine a set of inputs that would result in the expected output. So for the roman numeral example, this could mean that I would request it to run the net in reverse for the symbol 'X' and it would generate an image that would resemble what the net thought an 'X' looked like. In this way, I could get a good idea of the features it learned to separate the classifications. I feel as it would be very beneficial in understanding how ANNs function and learn in the grand scheme of things.
For a simple feed-forward fully connected NN, it is possible to project hidden unit activation into pixel space by taking inverse of activation function (for example Logit for sigmoid units), dividing it by sum of incoming weights and then multiplying that value by weight of each pixel. That will give visualization of average pattern, recognized by this hidden unit. Summing up these patterns for each hidden unit will result in average pattern, that corresponds to this particular set of hidden unit activities.Same procedure can be in principle be applied to to project output activations into hidden unit activity patterns.
This is indeed useful for analyzing what features NN learned in image recognition. For more complex methods you can take a look at this paper (besides everything it contains examples of patterns that NN can learn).
You can not exactly run NN in reverse, because it does not remember all information from source image - only patterns that it learned to detect. So network cannot "imagine a set inputs". However, it possible to sample probability distribution (taking weight as probability of activation of each pixel) and produce a set of patterns that can be recognized by particular neuron.
I know that you can, and I am working on a solution now. I have some code on my github here for imagining the inputs of a neural network that classifies the handwritten digits of the MNIST dataset, but I don't think it is entirely correct. Right now, I simply take a trained network and my desired output and multiply backwards by the learned weights at each layer until I have a value for inputs. This is skipping over the activation function and may have some other errors, but I am getting pretty reasonable images out of it. For example, this is the result of the trained network imagining a 3: number 3
Yes, you can run a probabilistic NN in reverse to get it to 'imagine' inputs that would match an output it's been trained to categorise.
I highly recommend Geoffrey Hinton's coursera course on NN's here:
https://www.coursera.org/course/neuralnets
He demonstrates in his introductory video a NN imagining various "2"s that it would recognise having been trained to identify the numerals 0 through 9. It's very impressive!
I think it's basically doing exactly what you're looking to do.
Gruff