NuPIC on MNIST Dataset - classification

I am a newbie. I think the idea of NuPIC is really cool, so I wanted to apply a KNN classifier to
NuPIC's output. I saw there is already a KNNClassifier object in Python. I am confused about the input
pattern that I should use. With the MNIST dataset I will have images, where each image is a sparse 2D
array of numbers. I understand that the output can be encoded using the
categorical encoder in NuPIC, but there is no example of encoding an input that comes in the
form of arrays.
Any help will be highly appreciated.

This might help: http://numenta.org/search.html?q=mnist. There are some good discussions on our mailing lists about MNIST.
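To make the array question concrete, here is a minimal sketch of one common way to turn a 2D grayscale image into a flat binary pattern. It uses only NumPy and does not call any NuPIC API; the function name image_to_binary_vector and the threshold of 128 are assumptions chosen for illustration, and how the resulting vector is passed to KNNClassifier is left out because it depends on the NuPIC version.

```python
import numpy as np

def image_to_binary_vector(image, threshold=128):
    """Flatten a 2D grayscale image into a 1D vector of 0s and 1s.

    The threshold is an arbitrary illustrative choice, not a NuPIC default.
    """
    pixels = np.asarray(image)
    if pixels.ndim != 2:
        raise ValueError("expected a 2D array of pixel values")
    # Pixels at or above the threshold become 1, everything else 0.
    return (pixels >= threshold).astype(np.uint8).ravel()

# Tiny 3x3 stand-in for a 28x28 MNIST digit (made-up values).
toy_image = [[0, 200, 0],
             [0, 255, 0],
             [0, 130, 10]]
vector = image_to_binary_vector(toy_image)
print(vector)  # [0 1 0 0 1 0 0 1 0]
```

A real MNIST image would give a vector of length 784 with this approach.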

Related

Deeplearning4j Autoencoder

I couldn't find a full example of an autoencoder in the DL4J documentation. I see a good general description of autoencoders here, with a small piece of code for just the MultiLayerConfiguration, but the code is not complete. Is there a full example where a dataset is loaded, pre-processed and fed into the network, and a prediction is generated? For example, one that works with the MovieLens dataset, or any other. Thank you.
There is an example of a deep autoencoder using the MNIST dataset here:
https://deeplearning4j.konduit.ai/deeplearning4j/reference/auto-encoders
With code here:
https://github.com/eclipse/deeplearning4j-examples/blob/master/dl4j-examples/src/main/java/org/deeplearning4j/examples/quickstart/modeling/feedforward/unsupervised/MNISTAutoencoder.java

HTML Embeddings into Neural Networks?

I'm beginning my journey into Neural Networks and trying to understand both character and word embeddings as ways to input text data into a NN. Specifically, I am trying to embed HTML tag information. I tried googling some different combinations of my problem and came up empty.
My current understanding is that embeddings "embed" words or characters into an N-dimensional space, which lets NNs take them as inputs. So in this case, something like word2vec would not necessarily help me, because it is not meant to capture the "meaning" of HTML elements? Would a character embedding be better, then?
If anyone could point me in a direction that would be awesome, as I am having trouble finding this on my own.
Thanks in advance.
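One way to picture what an embedding does for HTML tags, as a minimal sketch and not a recommendation of any particular model: treat each tag name as a token, give it an integer id, and look up a row in an embedding matrix. The class TagCollector, the function tag_sequence, the sample HTML string and the embedding size of 4 are all assumptions made up for this example; the matrix is random here, whereas in a real network it would be learned during training.

```python
from html.parser import HTMLParser

import numpy as np

class TagCollector(HTMLParser):
    """Collects the names of opening tags in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

def tag_sequence(html):
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    return collector.tags

html = "<div><p>Hello <b>world</b></p><p>Bye</p></div>"
tags = tag_sequence(html)

# Vocabulary: one integer id per distinct tag name.
vocab = {tag: index for index, tag in enumerate(sorted(set(tags)))}
ids = [vocab[tag] for tag in tags]

# Random stand-in for a learned embedding matrix (one row per tag).
rng = np.random.default_rng(0)
embedding_matrix = rng.normal(size=(len(vocab), 4))

# Embedding lookup: each tag id selects its row.
embedded = embedding_matrix[ids]
print(tags, ids, embedded.shape)
```

Both occurrences of the p tag map to the same row, which is the sense in which an embedding is a lookup table over a fixed vocabulary.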

Non-image data with cnn [Matlab Specific]

I am trying to use a CNN to build a classifier for my data.
The training set consists of 2D numerical matrices which are not image data.
It seems that Matlab's CNNs only work with image inputs:
https://uk.mathworks.com/help/nnet/ref/imageinputlayer-class.html
Does anyone have experience with CNNs and non-image data using Matlab's deep learning toolbox?
Thank you.
Well, I would first like to understand why you want to use a CNN with non-image data. CNNs are especially good because they take into account information in the neighborhood of each value. Unless your data has some kind of local pattern (like pixels that combine to form a shape, or sentences where word order is relevant), a CNN would not be the best approach to handle it.
That being said, if you still want to use it, you could convert the matrices to images. I'm not sure if that would help, though.
The function to convert is mat2gray.
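For readers following along outside Matlab: by default mat2gray rescales a matrix so that its minimum maps to 0 and its maximum to 1. A rough NumPy equivalent of that min-max rescale is sketched below. The function name rescale_to_unit_range is made up, and returning all zeros for a constant matrix is an assumption of this sketch, not a statement about Matlab's behavior.

```python
import numpy as np

def rescale_to_unit_range(matrix):
    """Min-max rescale a 2D numeric matrix to the range [0, 1]."""
    data = np.asarray(matrix, dtype=np.float64)
    low, high = data.min(), data.max()
    if high == low:
        # Assumption for this sketch: a constant matrix maps to all zeros.
        return np.zeros_like(data)
    return (data - low) / (high - low)

sample = [[-2.0, 0.0],
          [2.0, 6.0]]
scaled = rescale_to_unit_range(sample)
print(scaled)

# Image-style layers usually expect a channel axis as well.
as_image = scaled[..., np.newaxis]
print(as_image.shape)  # (2, 2, 1)
```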

can we use autoencoders for text data

I am doing a project based on health care. I am going to train my autoencoders with symptoms and diseases, i.e. my input is in textual form. Will that work? (I am using RStudio.) Can anyone please help me with this?
You have to convert the text to vectors/numbers. Traditional approaches like bag of words and TF-IDF will help here, but newer neural word embeddings such as Word2Vec or RNN language models usually give better numeric representations of text.
Use any neural word embedding technique to convert the text into numbers/vectors, at word level (word2vec) or document level (doc2vec).
These vectors come with some dimension, and to compress the representation to an even smaller dimension you can use an autoencoder.
Feel free to ask for any other information you need.
Try using Python for these tasks, as it has the latest packages.
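As a minimal sketch of the text-to-vector step, here is a plain bag-of-words encoding in standard-library Python. The function names build_vocabulary and bag_of_words and the three symptom strings are made up for illustration; a real project would more likely use a library vectorizer or a trained embedding.

```python
from collections import Counter

def build_vocabulary(documents):
    """Map every distinct lowercase word to a column index."""
    words = sorted({word for doc in documents for word in doc.lower().split()})
    return {word: index for index, word in enumerate(words)}

def bag_of_words(document, vocabulary):
    """Count how often each vocabulary word occurs in the document."""
    counts = Counter(document.lower().split())
    return [counts.get(word, 0) for word in vocabulary]

# Made-up example records, not real medical data.
records = [
    "fever cough fatigue",
    "cough sneezing",
    "fever fever headache",
]
vocabulary = build_vocabulary(records)
vectors = [bag_of_words(record, vocabulary) for record in records]

print(list(vocabulary))  # ['cough', 'fatigue', 'fever', 'headache', 'sneezing']
for vector in vectors:
    print(vector)
```

Each row has one entry per vocabulary word, so the rows can be used directly as fixed-length input to an autoencoder.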
You can use an autoencoder on textual data, as explained here.
Autoencoders have usually worked better on image data, but recent approaches have adapted them so that they also work well on text data.
Have a look at this.
The code is also available on GitHub.

adding input to doc2vec

I've recently started using word2vec and doc2vec methods. They are amazing!
But I want to play around with them a bit. When I compared the two methods, I saw that the difference is that the doc2vec method has one extra input to the neural net, docMatrix. I want to add one more input to the neuralNet (a trained vector from somewhere else) and get the output vector for the document. Is that easy to do? Can someone help me understand what exactly is going on in the word2vec code?
Thanks :)
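To illustrate the idea of an extra input, here is a rough NumPy sketch of how the input to the hidden layer can be formed in the distributed-memory flavour of doc2vec: the document vector and the context word vectors are combined (averaged here; concatenation is the other common choice), and an additional vector can be combined in the same way. This is not the word2vec or gensim code; the names doc_vectors, word_vectors, extra_vector and hidden_input, the sizes, and the random values are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4

doc_vectors = rng.normal(size=(3, dim))    # plays the role of docMatrix
word_vectors = rng.normal(size=(10, dim))  # one row per vocabulary word
extra_vector = rng.normal(size=dim)        # stand-in for the externally trained vector

def hidden_input(doc_id, context_word_ids, extra=None):
    """Average the document vector, the context word vectors and an optional extra vector."""
    parts = [doc_vectors[doc_id]]
    parts.extend(word_vectors[i] for i in context_word_ids)
    if extra is not None:
        # The extra input must have the same dimension to be averaged in.
        parts.append(extra)
    return np.mean(parts, axis=0)

without_extra = hidden_input(0, [1, 5, 7])
with_extra = hidden_input(0, [1, 5, 7], extra=extra_vector)
print(without_extra.shape, with_extra.shape)
```

In an actual implementation the extra vector would either be kept fixed or updated by backpropagation along with the others, which is the part that requires changing the training code.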