Classification of new instances in Weka

In our training set, we performed feature selection (e.g. CfsSubsetEval with GreedyStepwise) and then classified the instances using a classifier (e.g. J48). We have saved the model Weka created.
Now we want to classify new [unlabeled] instances (which still have the original number of attributes of the training set, before it underwent feature selection). Are we right in assuming that we should perform the same feature selection on this set of new [unlabeled] instances so we can evaluate it with the saved model (to make the training and test sets compatible)? If yes, how can we filter the test set?
Thank you for helping!

Yes, both the test and training sets must have the same number of attributes, and each attribute must correspond to the same thing. So you should remove the same attributes (that you removed from the training set) from your test set before classification.

I don't think you have to re-run feature selection on the test set. If your test set already has the original number of attributes, load it, and in the "Preprocess" panel manually remove all the attributes that were removed during feature selection on the training set.

You must apply the same filter to the test set that you previously applied to the training set. You can use the Weka API to do this:
import weka.core.Instances;
import weka.attributeSelection.AttributeSelection;
import weka.attributeSelection.CfsSubsetEval;
import weka.attributeSelection.GreedyStepwise;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.Remove;

Instances trainSet = ...; // get training set
Instances testSet = ...;  // get testing set

// apply feature selection on the training data
AttributeSelection attsel = new AttributeSelection();
CfsSubsetEval eval = new CfsSubsetEval();
GreedyStepwise search = new GreedyStepwise();
attsel.setEvaluator(eval);
attsel.setSearch(search);
attsel.SelectAttributes(trainSet);
int[] retArr = attsel.selectedAttributes(); // get indices of the selected attributes

// set up the filter for removing attributes
Remove remove = new Remove();
remove.setAttributeIndicesArray(retArr);
remove.setInvertSelection(true); // retain the selected attributes, remove all others
remove.setInputFormat(trainSet);
trainSet = Filter.useFilter(trainSet, remove);
// now apply the same filter to the testing set as well
testSet = Filter.useFilter(testSet, remove);
// now you are good to go!


Change State on model runtime

P.S. This question has been edited to answer questions asked by @Felipe.
I have an agent-based model simulation for churn behavior modeling. On each iteration (based on time, i.e. one month), each user reconsiders her choice of operator (ours or another) based on model metrics (Cost/SocialNetwork/...). At runtime, even when I change parameters that should affect the agents' decisions, no one changes his/her operator. Here is my statechart image below:
I should note that the internal transition of (our user) has the following details:
The first two lines are just for display. Advocate() refers to the action of sending messages, which drives social influence.
But Switch() is where the decision happens, based on the new parameter values. In short, d is a normalized value in the range -1 to 1: signum(d) predicts which provider is preferred, and abs(d) shows how strongly the selected provider is preferred.
// Definition of Switch()
double d = (this.Social_impact() / 20) + this.Monthly_Charge_Impact();
if (d > 0)
    SwitchToUs();
else
    SwitchToOther();
The two functions SwitchToUs and SwitchToOther simply change the operator (as if creating transitions between the OUR_USER and OTHER_USER states).
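For reference, the decision rule above can be sketched as a standalone function (a hypothetical Python transcription for illustration only; the name switch_decision and the plain-argument form are not part of the model):

```python
def switch_decision(social_impact, monthly_charge_impact):
    """Hypothetical transcription of Switch(): the sign of d picks the
    provider, and abs(d) is the strength of the preference."""
    d = social_impact / 20 + monthly_charge_impact
    provider = "us" if d > 0 else "other"
    return provider, abs(d)
```

This makes it easy to probe, outside the simulation, which parameter combinations should actually flip an agent's operator.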

Is it possible to load a model that was saved with model.module.state_dict() using model.load_state_dict()?

I want to ask a question. I have trained a model on two GPUs and saved it with
model.module.state_dict(). Now I want to load this model on one GPU. Can I directly load the trained weights with model.load_state_dict()?
Thanks in advance!
You can refer to this question.
You can either add an nn.DataParallel wrapper for loading purposes, or change the key names like this:
import torch
from collections import OrderedDict

# original saved file with DataParallel
state_dict = torch.load('myfile.pth.tar')
# create a new OrderedDict that does not contain the `module.` prefix
new_state_dict = OrderedDict()
for k, v in state_dict.items():
    name = k[7:]  # remove `module.` (7 characters)
    new_state_dict[name] = v
# load params
model.load_state_dict(new_state_dict)
But since you saved the model with model.module.state_dict() instead of model.state_dict(), the keys may already lack the module. prefix, so the names may differ. If the two methods above don't work, print the saved dict and the model to see what you need to change, like:
state_dict = torch.load('myfile.pth.tar')
print(state_dict)
print(model)
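A small helper can make this robust either way; here is a sketch that assumes only that state-dict keys are strings (the name normalize_state_dict is illustrative, not a PyTorch API):

```python
def normalize_state_dict(state_dict, prefix="module."):
    """Strip a DataParallel-style prefix from every key that has it;
    keys without the prefix are left untouched, so this is safe to
    call on dicts saved either with or without the wrapper."""
    return {
        (k[len(prefix):] if k.startswith(prefix) else k): v
        for k, v in state_dict.items()
    }
```

You would then call model.load_state_dict(normalize_state_dict(torch.load('myfile.pth.tar'))), regardless of which way the checkpoint was saved.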

FMU 2.0 interaction - requires parallel "container" for parameter values etc?

I work with pyfmi in Jupyter notebooks to run simulations, and I like to work interactively and evaluate incremental changes in parameters etc. A long time ago I found it necessary to introduce a dictionary that works as a "container" for parameter and initial values. Now I wonder if there is a way to get rid of this "container", which after all is partly a parallel structure to "model"?
A typical workflow looks like this:
# create a diagram where results from the different simulations below will be shown
model = load_fmu(fmu_model)
parDict = {}
parDict['model.x_0'] = 1
parDict['model.a'] = 2
for key in parDict.keys(): model.set(key, parDict[key])
sim_res = model.simulate(10)
# plot results...
model = load_fmu(fmu_model)
parDict['model.x_0'] = 3
for key in parDict.keys(): model.set(key, parDict[key])
sim_res = model.simulate(10)
# plot results...
There is a function model.reset() that brings the state back to the default values at compilation without loading the FMU again, but you need to do more than the following:
model.reset()
parDict['model.x_0'] = 3
for key in parDict.keys(): model.set(key, parDict[key])
sim_res = model.simulate(10)
# plot results...
So this does NOT work as-is; after all, parameters and initial values need to be brought back, and we still need parDict, though we may avoid the load command.
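One way to keep the workflow interactive without scattering set() calls is a tiny helper that re-applies the cached values after every load or reset. This is a sketch assuming only that the model object exposes pyfmi's set(name, value) method; the name apply_parameters is illustrative:

```python
def apply_parameters(model, par_dict):
    """Re-apply cached parameter/initial values to a freshly
    loaded or reset model, then return it for chaining."""
    for name, value in par_dict.items():
        model.set(name, value)
    return model
```

A re-run then becomes: model.reset(); apply_parameters(model, parDict); sim_res = model.simulate(10). The dictionary is still a parallel structure, but the bookkeeping is reduced to one call.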

ag-grid: setFilterModel() unable to apply filter to async/callback set filter

We use ag-grid's asynchronous set filters. These provide significant speed increases and lower transmission payloads for our clients, a very valuable feature. However, we also invoke .setFilterModel() in onGridReady to load cached and saved filter configurations. These two features are unable to operate in tandem.
STEPS TO REPRODUCE:
Open https://embed.plnkr.co/hhgPgNM2plVpIQbB5aGj/
Select the Filter icon on the "Set filter col" column.
Wait for the Set Filter to populate.
Click the "Apply Filter using setFilterModel()" button.
Observe the behavior.
How can setFilterModel() initiate the values callback function so that, on success, the filter model is applied? Or please suggest how I can use synchronous callbacks instead of asynchronous ones. Thanks.
I played around with the plunker and modified applyFilter() slightly, and this works.
Basically, you need to notify ag-grid that you have applied the filter:
function applyFilter() {
  // get the instance of the set filter
  var valueFilterComponent = gridOptions.api.getFilterInstance('value');
  // use the filter API to select a value
  valueFilterComponent.selectValue('value 1');
  // let ag-grid know that the filter was applied
  valueFilterComponent.onFilterChanged();
}
More on set filters here

How to monitor error on a validation set in Chainer framework?

I am kind of new to Chainer and have written code which trains a simple feed-forward neural network. I have a validation set and a training set, and I want to test on the validation set every 500 iterations or so and, if the results are better, save my network weights. Can anyone tell me how I can do that?
Here is my code:
optimizer = optimizers.Adam()
optimizer.setup(model)
updater = training.StandardUpdater(train_iter, optimizer, device=0)
trainer = training.Trainer(updater, (10000, 'epoch'), out='result')
trainer.extend(extensions.Evaluator(validation_iter, model, device=0))
trainer.extend(extensions.LogReport())
trainer.extend(extensions.PrintReport(['epoch', 'main/loss', 'validation/main/loss', 'elapsed_time']))
trainer.run()
Error on the validation set
It is reported by the Evaluator and printed by the PrintReport extension, so it should already be shown with your code above. To control how often these extensions run, you can pass the trigger keyword argument to trainer.extend.
For example, the code below prints the report every 500 iterations:
trainer.extend(extensions.PrintReport(['epoch', 'main/loss', 'validation/main/loss', 'elapsed_time']), trigger=(500, 'iteration'))
You can also specify a trigger for the Evaluator.
Saving network weights
You can use the snapshot_object extension:
http://docs.chainer.org/en/stable/reference/generated/chainer.training.extensions.snapshot_object.html
It is invoked every epoch by default.
If you want to invoke it only when the loss improves, you can set the trigger using MinValueTrigger:
http://docs.chainer.org/en/stable/reference/generated/chainer.training.triggers.MinValueTrigger.html
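The logic behind MinValueTrigger is simply "fire when the watched value reaches a new minimum". A minimal standalone sketch of that bookkeeping (the class name BestKeeper is illustrative, not a Chainer API):

```python
class BestKeeper:
    """Fire (return True) whenever the observed value hits a new minimum,
    mirroring what MinValueTrigger does for a watched report key."""
    def __init__(self):
        self.best = float('inf')

    def update(self, value):
        if value < self.best:
            self.best = value
            return True   # new best: a good moment to save the weights
        return False

# In Chainer itself this would correspond roughly to (untested sketch):
# trainer.extend(extensions.snapshot_object(model, 'best_model.npz'),
#                trigger=triggers.MinValueTrigger('validation/main/loss',
#                                                 trigger=(500, 'iteration')))
```

With the trigger attached this way, the snapshot is written only at the iterations where the validation loss improves.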