Why can't I save using Save Data or Save Model? - orange

I cannot seem to save any of my models, nor any of the data created from their training. What am I missing? The input is >15,000 images created in ~200 classes.

Your Save Data (3) widget does not save anything because it receives no data at its input: note the dashed line connecting Logistic Regression and Save Data (3). Logistic Regression does not output any data because it does not receive any data at its own input.
You cannot save the models this way either, since they have not been trained on data. In your example, they just provide the learner to the Test and Score widget. To use the Save Model widget, the preceding widget (e.g. Logistic Regression) must receive data at its input, which it then uses for training. See the example in the documentation.

Related

How to perform simple linear regression with Orange

I'm learning Orange and I want to perform a super simple task: simple linear regression with made up data points. I want to start from scratch, using data generated with the Paint Data tool. This is the workflow I have created.
What's wrong with this approach? Why do I get the error? It must be somehow related to the use of the Paint Data tool, which perhaps requires some sort of processing? I inspected the data with Scatter Plot and it looks as expected.
Reading the error message could help. :) It says "Training data requires a target variable". In more statistical terminology, Paint Data outputs two independent variables. Although they are named x and y, and y is a name we often use for the dependent variable, Test and Score won't do the guesswork.
Insert a "Select Columns" widget between Paint Data and Test and Score, and drag "y" to "Target variable".

Predictions using Convolutional Neural Networks and DL4J

This is my first time working with DL4J (Deep Learning for Java) and also my first Convolutional Neural Network. My goal is to use the Convolutional Neural Network to give me some predicted values about an image. I gathered and labelled my images myself. The labels or expected outputs consist of two numbers between 0 and 1 (I just wrote them in the file name like 0.01x0.87.jpg).
Now I can't find any way to use the DataSetIterator Class which DL4J uses so that I can also set my label values.
Is there a simple way to tell DL4J that I want to train my Network to recognize that image 0.01x0.01.jpg should spit out the values 0.01 and 0.01?
What you want to do is usually known as regression. In contrast to classification, where the output is a discrete value such as 0 or 1, in regression any value can be the target.
In your case, you will likely want to use a network architecture that uses either a sigmoid (which forces your values to be between 0 and 1) or an identity (which keeps the values as is, i.e. allows for them to be outside of the 0 to 1 range) activation function.
As you have two values that you are trying to predict, you will have to also define that you are using two outputs.
So much for your model architecture.
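To make the difference between the two activation choices concrete, here is a minimal sketch in plain Python. It is not DL4J code: the function names and the sample values are made up for illustration, and in DL4J you would select the activation on the output layer rather than implement it yourself.

```python
import math


def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def identity(z):
    """Return the value unchanged, so it may fall outside the 0 to 1 range."""
    return z


# Hypothetical raw values of the two output units for one image.
raw_outputs = [-3.0, 4.2]

sigmoid_outputs = [sigmoid(z) for z in raw_outputs]
identity_outputs = [identity(z) for z in raw_outputs]

print(sigmoid_outputs)   # both values lie strictly between 0 and 1
print(identity_outputs)  # same as raw_outputs
```

Since the labels here are known to lie between 0 and 1, the sigmoid variant guarantees predictions in that range, while the identity variant leaves it to training to keep them there.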
For data loading, you can use the ImageRecordReader, but also pass it a PathMultiLabelGenerator of your own. When you implement the PathMultiLabelGenerator interface, you get the full path of the image as a string, and you can do whatever you want with it: for example, remove the file extension, split on x, and parse your filename into a list of DoubleWritable. DoubleWritable is just a simple wrapper class for double, so creating one is as easy as passing the actual value to its constructor.
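The filename parsing described above could look roughly like the following. It is sketched in plain Python to keep it short; in DL4J the same steps would live inside your own PathMultiLabelGenerator implementation and would wrap each value in a DoubleWritable. The function name labels_from_path and the example path are hypothetical.

```python
import os


def labels_from_path(path):
    """Parse a path like '/some/dir/0.01x0.87.jpg' into [0.01, 0.87]."""
    file_name = os.path.basename(path)        # '0.01x0.87.jpg'
    stem, _ext = os.path.splitext(file_name)  # '0.01x0.87', '.jpg'
    parts = stem.split("x")
    if len(parts) != 2:
        raise ValueError("expected two values separated by 'x': " + file_name)
    # float() raises ValueError by itself if a part is not a number.
    return [float(part) for part in parts]


print(labels_from_path("/data/train/0.01x0.87.jpg"))
```

Failing loudly on a malformed name is a deliberate choice in this sketch: a silently mislabelled image is much harder to notice later than an exception during data loading.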
To create a dataset iterator you can now follow the documentation on RecordReaderDataSetIterator.

Spark-mllib retraining saved models

I am trying to make a classification with spark-mllib, especially using RandomForestModel.
I have taken a look at this example from Spark (RandomForestClassificationExample.scala), but I need a somewhat expanded approach.
I need to be able to train a model, save the model for future usage, but also to be able to load it and train further. Like, extend the dataset and train again.
I completely understand the need to export and import a model for future usage.
Unfortunately, training "further" isn't possible with Spark, nor does it make sense. It is therefore recommended to retrain the model on the data used to train the first model plus the new data.
The values and metrics from your first training (e.g. features, intercept, coefficients, etc.) don't make much sense anymore once you add more data.
I hope that this answers your question.
You may need to look for some reinforcement learning technique instead of Random Forest if you want to use the old model and retrain it with new data.
As far as I know, deeplearning4j implements deep reinforcement learning algorithms on top of Spark (and Hadoop).
If you only need to save a JavaRDD<Object>, you can do (in Java):
model.saveAsObjectFile(pathOfYourModel);
Values will be written out using Java Serialization. Then, to read your data back, you do:
JavaRDD<Object> model = jsc.objectFile(pathOfYourModel);
Be careful, object files are not available in Python. But you could use saveAsPickleFile() to write your model and pickleFile() to read it.

getting paragraph representation for unseen paragraphs in doc2vec

I would like to use the gensim doc2vec model for a classification task.
However, it seems like the gensim implementation of doc2vec needs to see all documents (train and test) to build the vocabulary before training the model. Otherwise, you get a KeyError if you want to get the document vector of a document that was not present when building the vocabulary. I wonder if my understanding is correct! In practice, one does not have access to the test data at the time of training.
Is there any way to update the vocabulary at the test time to be able to get document representation of test documents?
You can only look-up learned document-vectors for material that was presented during training.
But there is a method, infer_vector(), which takes a new tokenized document, presents it to the frozen, trained model, and returns a 'best-fit' vector. It approximates what would have been returned if the new document had been available during training. See:
https://radimrehurek.com/gensim/models/doc2vec.html#gensim.models.doc2vec.Doc2Vec.infer_vector

Is it possible to set initial state to a simulink model to do simulations?

Suppose I have built an electrical circuit or any other system in Simulink. To run simulations, Simulink presumably builds a state-space model of the system, right? If that is the case, is it possible to set an initial condition for this model? Moreover, is it possible to find out what the state variables of the model built by Simulink are?
The Simulink.BlockDiagram.getInitialState method can be used to interrogate the model and return a structure giving the current initial values of the states.
The values in the structure can then be changed and the (new) values used with the model configuration parameters to start at a different initial state. See the doc for a usage example.