Multiple output classification in Google AutoML Table - classification

Is it possible to train a multiple output classification model using Google AutoML Table? If so, how can I predict a vector and not a scalar of multi-label classes with the AutoML Table? Concatenating the vector of output classes to get a single scalar as a workaround is not an alternative.

Unfortunately, currently this is not available. Please feel free to use "SEND FEEDBACK -> Product feedback" in the web page to send feedback for the feature request.

Related

Does Scikit Learn SelectKBest group continous data itself?

I want to use the SelectKBest Scikit Learn algorithm (with chi-square test) to reduce the number of features in my ML setup. Unfortunately my feature-data is continous. So my question is, if scikit groups this data itself or is this my job? On the one hand in most cases the input data is continous and chi-sqaure-test seems to be an often used method. On the other hand there is no possibility to define a number of groups.
Thanks
Prickles

Predict a number with a given image (0 to 1)

I am a total beginner to ML and Neural networks. I am currently working on a project where I have a lot of pictures stored in a MongoDB database. Each one of those pictures has a number from 0 to 1. For example "picture 1" 0.71.
I want to train my model given the database. The main goal for the project is that after the model is finished and trained, given an image the model will be able to return(predict) a number from 0 to 1. After doing some research and asking a few people I figured out some libraries that would be useful for the project are: Tenserflow and Keras. Some people told me that it is impossible, but I'm not sure therefore I came to ask here.
So my questions are: Is it possible? If so, how can I implement it? Are there any specific tools you recommend? If you specify a way that I should use for my project do I need to export my MongoDB database in a certain form? Since I am a beginner maybe there are some tutorials that you think that can help?
I'm sorry if this question is a bit too general, if there are any misunderstandings please comment and I will try to answer.
Thanks in advance!
What you want to do is totally feasible, this kind of project is called regression, since you are using images data the best type of models are called convolutional neural network (CNN), you'll need some understanding if you want to build your own model. I've done a project where I had to predict a number of bacterial colonies using an image, much like your problem except that I had no boundaries on the predicted values.
What is a CNN ? Here is a link
Basically a CNN will understand the features in the images and will use those features to predict a value.
You won't need to create your own model, most people just use well-designed one in the scientific litterature.
Go for keras, it's the easiest framework out there and work like a charm. Here is how to implement VGG16 (an architecture that is probably the best for your problem) : link
You should follow this tutorial to get going on developing with keras.
Last hint: don't use the same last layer as the one on the VGG16 implementation, use a Dense Layer with one neuron and with a sigmoid/linear/leaky relu activation.
ie:
#model.add(Dense(1000, activation='softmax'))
model.add(Dense(1, activation='sigmoid'))
This means : predict 1 number (sigmoid will bound it between 0 and 1, but maybe lrelu or linear is better)
Also, I guess you could use MongoDB to read the images as arrays, but I would just put the images on a folder.
Edit : When compiling the model, use a mean squared error as in
adam = keras.optimizers.Adam(lr=1e-4)
model.compile(optimizer=adam, loss='mse')
Here you have the "hello world program" in terms of neural networks and digits classification. You can start studying it because I think you will end up with a similar architecture for your NN. What you should focus on is the output of your model, because in this example they are performing classification on 10 classes (digits from 0 to 9) but you are trying to read a real number. You could try to use a single neurone with sigmoid or linear activation at the end of your model.

Word to vector where should I start?

I'm trying to implement a neural networks model on labeled data that I have. The data contains several columns (categorical and numeric features as well).
Few columns in this data contains a short description, written by users which I also want to analyze but I don't know how to start.
The data looks something like this:
problem ID status description labels
1 closed short description of the problem CRM
2 open short description of the problem ERP
3 closed short description of the problem CRM
Using status (which I will convert into dummy variables) and description (this is where I need you guys), I want to train the model to predict the labels.
Any idea about how should I start? How can I convert the description columns into a useful data?
Thanks!
You want to do the classification basically based on the features, for categorical variables encode them into some trainable form. for text first, perform cleaning, if that has more numbers then convert numbers into their words form and make vectors for it using tf-idf or any other vectorization approach, also normalize your numerical features and then train a simple svm classifier with it, if not giving good accuracy then go with CNN and LSTM based neural network, you can also try CNN+Embeddings for better results.

How to use word2vec and RNN together?

So, I am using word2vec in Java, and trying to train it somehow so that it gives me vector representation for words and sentences.
Can I use this for feeding input into a neural network, to get a response on the basis of the word2vec data? I am planning to make a chatbot with the help of this.
Adding on to #galloguille's comments, you can use pre-trained word2vec's word-vectors to initialize your RNN. RNN can learn from sequence of words to predict next word(s). A good example with code for this, you can find here - https://github.com/larspars/word-rnn.
There is good collection of current state of the art on chatbots here - https://stanfy.com/blog/the-rise-of-chat-bots-useful-links-articles-libraries-and-platforms/
From my understanding, most effective chatbots don't use RNN directly (at present) to reply to a question, but try to predict intent (from a fixed set of intents) of the question in the first step. Based on each intent, they calculate some actionable insights and a logical reply to the question.

can we use autoencoders for text data

I am doing my project based on health care.I am going to train my autoencoders with the symptoms and the diseases i.e my input is in textual form. Will that work? (I am using Rstudio).Please anyone help me with this
You have to convert the text to vectors/numbers. To do this traditional approaches like Bag of words, Tf-Idf will help but the latest Neural Word Embedding like Word2Vec, RNN Language model etc are the best techniques to obtain numeric representation of text.
Please use any Neural Word Embedding technique and convert the text(word level[word2vec], document level[doc2vec]) into numbers/vectors.
Now these vectors come with some dimension and to compress this representation to even smaller dimension u can use AutoEncoder.
Feel Free to ask any other information required.
Try using Python for these tasks as it has the latest packages.
You can use Autoencoder on Textual data as explained here.
Autoencoder usually worked better on image data but recent approaches changed the autoencoder in a way it is also good on the text data.
have a look at this.
the code is also available in GitHub.