How to interpret the output of Weka MultilayerPerceptron?

I have a Multilayer Perceptron model in Weka and I want to extract knowledge from this output:
=== Classifier model (full training set) ===
...
Sigmoid Node 19
Inputs Weights
Threshold -0.1952207078426809
Attrib Age 6.055214343595766
Attrib Gender=Female 1.9806393961914877
Attrib Polyuria=Yes 2.092522712544858
Attrib Polydipsia=No -1.2458204564691266
Attrib sudden weight loss=Yes 0.4185898280097185
Attrib weakness=No -0.8314652455975647
Attrib Polyphagia=Yes -0.48400540426846483
Attrib Genital thrush=Yes -0.2226565451203396
Attrib visual blurring=Yes 3.0186785501154456
Attrib Itching=No 1.9350277038164228
Attrib Irritability=Yes -1.3543816020735406
Attrib delayed healing=No 1.862432846595033
Attrib partial paresis=Yes 1.0250701513525546
Attrib muscle stiffness=No 2.0216597800998932
Attrib Alopecia=No 0.5984702263543803
Attrib Obesity=No -1.704440363167018
Class Positive
Input
Node 0
Class Negative
Input
Node 1
How can I interpret this output (the nodes and the weights)?
Thanks in advance

You might have to dig into the source itself to understand how the output relates to the calculations under the hood. Here is a link to the toString() method of the MultilayerPerceptron classifier that generated this output.
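As a rough guide to the numbers: each "Sigmoid Node" takes a weighted sum of its inputs plus the threshold (a bias term) and passes it through a sigmoid. In this output, Node 19's inputs are the attributes, so it is a hidden node; the "Class Positive / Input / Node 0" lines at the end map each class to the output node whose activation is read as that class's score. Below is a minimal Python sketch (not Weka's actual code) of how node 19's activation would be computed from the listed weights, assuming the attribute values have already been normalised/binarised the way Weka's MultilayerPerceptron does internally:

import math

# Weights copied from the "Sigmoid Node 19" block above; the threshold acts as a bias.
threshold = -0.1952207078426809
weights = {
    "Age": 6.055214343595766,
    "Gender=Female": 1.9806393961914877,
    "Polyuria=Yes": 2.092522712544858,
    # ... remaining attribute weights omitted for brevity
}

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def node_output(inputs):
    # inputs: attribute name -> value (0/1 for the binary indicator
    # attributes, a normalised numeric value for Age)
    s = threshold + sum(w * inputs.get(name, 0.0) for name, w in weights.items())
    return sigmoid(s)

# Hypothetical instance: normalised Age 0.4, female, polyuria present.
print(node_output({"Age": 0.4, "Gender=Female": 1, "Polyuria=Yes": 1}))

Large positive weights (Age, visual blurring=Yes) push this node towards 1 when those attributes are present, and large negative weights push it towards 0. What that means for the final class depends on the weight this hidden node carries in the output nodes, which are listed in the part of the output elided above.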

Related

FastText quantize documentation incorrect?

I'm unable to run FastText quantization as shown in the documentation. Specifically, as shown at the bottom of the cheat sheet page:
https://fasttext.cc/docs/en/cheatsheet.html
When I attempt to run quantization on my trained model "model.bin":
./fasttext quantize -output model
the following error is printed to the shell:
Empty input or output path.
I've reproduced this problem with builds from the latest code (September 14 2018) and older code (June 21 2018). Since the documented command syntax isn't working, I tried adding an input argument:
./fasttext quantize -input [file] -output model
where [file] is either my training data or trained model. Unfortunately both tries resulted in a segmentation fault with no error message from FastText.
What is the correct command syntax to quantize a FastText model? Also, is it possible to both train and quantize a model in a single run of FastText?
Solution in Python:
# Quantize the model with retraining
model.quantize(input=train_data, qnorm=True, retrain=True, cutoff=200000)
# Save quantized model
model.save_model("model_quantized.bin")
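To the second part of the question (train and quantize in one run): with the Python bindings both steps fit in one script. A short sketch, assuming a supervised training file train.txt and placeholder hyper-parameters:

import fasttext

# Train a supervised model (file name and hyper-parameters are placeholders).
model = fasttext.train_supervised(input="train.txt", epoch=25, lr=1.0)

# Quantize in the same run: retrain=True re-optimises the quantized vectors on
# the same input file, cutoff prunes the vocabulary to the given size.
model.quantize(input="train.txt", qnorm=True, retrain=True, cutoff=200000)

# Quantized models are conventionally saved with the .ftz extension.
model.save_model("model_quantized.ftz")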
I tried this and it worked:
./fasttext quantize -input <training set> -output <model name (no suffix)> [options]
This is the example that is included in the quantization-example.sh
./fasttext quantize -output "${RESULTDIR}/dbpedia" -input "${DATADIR}/dbpedia.train" -qnorm -retrain -epoch 1 -cutoff ...

Vowpal Wabbit: Input of neural network?

In the machine learning tool Vowpal Wabbit (https://github.com/JohnLangford/vowpal_wabbit/), normally a linear estimator y* = wx is trained. However, it is also possible to add a feedforward neural network.
My question is: when I use the neural network via the command line option "-nn x", is the linear estimator wx completely replaced by a neural network?
Edit: Thanks Martin and arielf. So apparently the different configurations look like this:
The weights of the models with "--nn" are estimated by backpropagation?
[Edit: corrected answer: original wasn't accurate, thanks Martin]
The 1-layer NN feeds input features into the NN layer (all possible interactions) which are then fed to the output layer.
In order to add pass-through features as-is, without interactions, you should add the --inpass option.
You can look at models created by using --invert_hash to get a readable model on a small example:
$ cat dat.vw
1 | a b
2 | a c
# default linear model, no NN:
$ vw --invert_hash dat.ih dat.vw
...
$ cat dat.ih
...
:0
Constant:116060:0.387717
a:92594:0.387717
b:163331:0.193097
c:185951:0.228943
# Now add --nn 2 (note double-dash in long option)
# to use a 1-layer NN with 2 nodes
$ vw --nn 2 --invert_hash dat-nn.ih dat.vw
...
$ cat dat-nn.ih
...
:0
Constant:202096:-0.270493
Constant[1]:202097:0.214776
a:108232:-0.270493
a[1]:108233:0.214776
b:129036:-0.084952
b[1]:129037:0.047303
c:219516:-0.196927
c[1]:219517:0.172029
Looks like a[N] is the contribution of a to hidden-layer NN node N (the index apparently starts at zero, so the standalone a notation without an index refers to node 0, which is a bit confusing).
When you add --inpass you get an additional weight per feature (index [2]):
$ vw --nn 2 --inpass --invert_hash dat-nn-ip.ih dat.vw
...
$ cat dat-nn-ip.ih
...
:0
Constant:202096:-0.237726
Constant[1]:202097:0.180595
Constant[2]:202098:0.451169
a:108232:-0.237726
a[1]:108233:0.180595
a[2]:108234:0.451169
b:129036:-0.084570
b[1]:129037:0.047293
b[2]:129038:0.239481
c:219516:-0.167271
c[1]:219517:0.139488
c[2]:219518:0.256326
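To make the notation concrete, here is a small Python sketch (not VW code) of how the weights from dat-nn.ih combine for the example 1 | a b: each hidden node n receives the sum of the feature weights tagged [n] (the untagged names being node 0) plus the Constant, and VW squashes that sum with tanh before the output layer, whose weights are not shown in the excerpt above.

import math

# Feature -> hidden-node weights copied from dat-nn.ih:
#   plain name -> weight into hidden node 0, name[1] -> weight into hidden node 1
weights = {
    0: {"Constant": -0.270493, "a": -0.270493, "b": -0.084952, "c": -0.196927},
    1: {"Constant":  0.214776, "a":  0.214776, "b":  0.047303, "c":  0.172029},
}

def hidden_activations(features):
    # features: the set of input features active (with value 1) in one example
    acts = []
    for node in sorted(weights):
        w = weights[node]
        s = w["Constant"] + sum(w.get(f, 0.0) for f in features)
        acts.append(math.tanh(s))  # tanh squashing of each hidden unit
    return acts

# Example "1 | a b": features a and b are active.
print(hidden_activations({"a", "b"}))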

Performing additional validation in LIBSVM matlab

I have been working with LIBSVM in MATLAB for a while to do prediction. I have a dataset of which I use 75% for training, 15% for finding the best parameters, and the remainder for testing. The code is given below.
trainX and trainY are the input and output training instances
testValX and testValY are the validation dataset I use
for j = 1:100
    for jj = 1:10
        model(j,jj) = svmtrain(trainY,trainX,...
            ['-s 3 -t 2 -c ' num2str(j) ' -p 0.001 -g ' num2str(jj) '-v 5']);
        [predicted_label, ~, ~] = svmpredict(testValY,...
            testValX,model(j,jj));
        MSE(j,jj) = sum(((predicted_label-testValY).^2)/2);
    end
end
[min_val,min_indi] = min(MSE(:));
best_predicted_model_rbf(i) = model(min_indi);
My question here is whether this is correct. I am creating a model matrix with different values of c and g. I use the -v option, which is key here. From the predicted models I use the validation dataset for prediction and thereby compute the mean square error. Using this MSE I pick the best c and g. Since I am using -v, which returns the cross-validated output, is the procedure I follow correct?
First, I think there is a slight problem with the code shown, which is that num2str(jj) '-v 5']); doesn't have a space before the -v. That may cause that flag to not be read. In the other question, you stated that this 'sometimes returns a model', which is what would happen if that flag was not read. If the flag is read, you should only get a number, not a model, when the '-v' flag is used.
Second, it looks like you are doing two different things here, either one of which would be reasonable on its own. Calling svmtrain with '-v' runs cross validation on the training set. That shouldn't return a model, it should just return an mse estimate. You could use these estimates to determine which parameter setting was best, and then train one model with that setting on all of the training data.
Anyway, next you call svmpredict(y,x,model) on a hold-out validation set, testValX, but having called svmtrain with '-v', model should just be a scalar at this point. In order for this call to run correctly, you have to get the model from svmtrain without '-v', so that it is a struct. The rest of what you are doing makes sense for this case, in which you are doing hold-out validation using testValX.
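For what it's worth, the workflow this answer describes (cross-validate over the (c, g) grid on the training data only, refit a single model with the best pair, then score it once on the hold-out set) looks like this in Python with scikit-learn; this is only an illustration of the procedure, not the LIBSVM MATLAB interface used in the question, and the data below are placeholders:

import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error

# Placeholder arrays standing in for trainX/trainY and testValX/testValY.
rng = np.random.default_rng(0)
trainX, trainY = rng.normal(size=(75, 5)), rng.normal(size=75)
testValX, testValY = rng.normal(size=(15, 5)), rng.normal(size=15)

# Grid over C and gamma (smaller than the question's 100 x 10 grid, to keep
# the sketch quick).
param_grid = {"C": list(range(1, 11)), "gamma": list(range(1, 6))}

# 5-fold cross-validation on the training data only -- the '-v 5' step.
search = GridSearchCV(SVR(kernel="rbf", epsilon=0.001), param_grid,
                      cv=5, scoring="neg_mean_squared_error")
search.fit(trainX, trainY)

# GridSearchCV refits the best (C, gamma) on all the training data; score that
# single refit model once on the hold-out validation set.
mse = mean_squared_error(testValY, search.best_estimator_.predict(testValX))
print(search.best_params_, mse)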

Computing the value range for a netcdf 3D variable

I have a large series of netcdf files representing daily snapshots of data. I am hoping to hook these up to a piece of software which asks me to add the maximum and minimum values for a variable to the namelist. How can I enquire about the maximum and minimum values stored in a variable?
My variable is depth (here is an excerpt from an ncdump for an idea of the size of that variable)
...
dimensions:
z = 40 ;
lat = 224 ;
lon = 198 ;
time = 1 ;
variables:
float depth(z, lat, lon) ;
depth:long_name = "cell centre depth" ;
depth:units = "m" ;
...
I'm still a beginner at handling these files, and have been using NCO operators and/or matlab for netcdf handling to date - is there an easy way to perform this max min enquiry using either of these tools?
Until now I have had netcdfs where the value range was helpfully displayed in the attributes, or the data were small enough to inspect with a simple ncdump -v look at the values, or to load the variable into MATLAB, which automatically displays the max and min; but now I have too many values for these quick and dirty methods.
Any help is gratefully received.
All the best,
Bex
One NCO method would be to use the ncrng command, which is simply a "filter" for a longer ncap2 command:
zender@roulee:~/nco/data$ ncrng three_dmn_rec_var in.nc
1.000000 to 80.000000
So, it's a three word command. Documentation on filters is here.
If you have a newer version of MATLAB, try using the ncread function.
% Update with your filename and variable name below.
% This reads in the full variable into MATLAB
variableData = ncread(filename,varname);
% Query max and min values
minValue = min(variableData(:))
maxValue = max(variableData(:))
% you could also write this information back to the file for future reference.
% see https://www.unidata.ucar.edu/software/netcdf/docs/netcdf/Attribute-Conventions.html
ncwriteatt(filename, varname, 'valid_range', [minValue, maxValue]);
% check result
ncdisp(filename, varname);
You could add two additional loops outside, one for looping through all your files and another for looping through all the variables in a file (look at ncinfo) to automate the whole thing.
The CDO method would be
cdo vertmax -fldmax in.nc max.nc
cdo vertmin -fldmin in.nc min.nc
The advantage is that you can calculate min/max just over x-y space (fldmax/fldmin), vertically (vertmax/min) or over time (timmax/min), or a combination of the three.
To dump the values from the netcdf to ascii you can use ncks
ncks -s '%13.9f\n' -C -H -v depth max.nc
To construct a namelist therefore you could for example write
echo min=`ncks -s '%13.9f\n' -C -H -v depth min.nc` >> namelist.txt
echo max=`ncks -s '%13.9f\n' -C -H -v depth max.nc` >> namelist.txt
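If you would rather do this outside the shell, the same range query is a few lines of Python with the netCDF4 package (a sketch; adjust the file and variable names, and loop over your daily files to get the overall range):

from netCDF4 import Dataset

# Open one file and read the full depth variable (a masked array, so any
# fill values are ignored by min()/max()).
with Dataset("in.nc") as nc:
    depth = nc.variables["depth"][:]

print("min:", depth.min(), "max:", depth.max())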

How to classify an unlabelled dataset with a newly trained NaiveBayes classifier in Weka

I have an unlabelled dataset that I want to classify with my newly trained NaiveBayes classifier in Weka. In the Classify mode in Weka, if I choose the Supplied test set option, it accepts the test set only if it is labelled, and it then evaluates it and gives the accuracy.
But what I want is to train it using a train.csv or train.arff file, then give it a new, unseen and unlabelled test.csv or test.arff file, and have it classify that file and assign labels according to the classes in the training file. But if I provide an unlabelled file as the test file, Weka gives:
ERROR: Train and Test set not compatible
Sample format of my Train and test files are as below:
Train.csv file:
article story .......hockey class
1 0 ...... 0 politics
0 0 .......1 sports
.
.
.
.
. sports
and Test.csv file:
article story .......hockey class
0 1 ...... 0
1 0 .......1
.
.
.
.
.
So how do I classify an unlabelled dataset in Weka using NaiveBayes classifier??
It seems you are missing the class label. Weka requires the training and test sets to have exactly the same attributes in the same order. Now there are two cases:
You know the classes of your test set
The performance is calculated by comparing the actual class labels with the predicted ones. You need to supply the class labels in your test set like you did in your training set.
You DON'T know the classes of your test set
To calculate a performance, Weka needs to compare the predicted classes with the actual classes. If you don't have the actual classes, you cannot calculate the performance. You can only predict classes.
You have to add a class label with missing values for your test instances if you just want prediction.
Even if your test set is labelled, Weka will not look at the labels during classification. It applies the classifier you built on the training data to the test set you supply, predicts a class for each instance, and then compares the prediction against the actual label to keep track of correct and incorrect classifications. So what you are doing here is exactly what you are trying to achieve. The error is telling you that the training and test sets are not compatible because, I believe, you have removed the "class" label from the test set. Don't worry: keep it as it is, and the accuracy you are getting from Weka is the actual performance of the classifier. Hope that helps.
You can't leave the class column entirely empty; you need to supply at least one example of each class label in the class field (as a kind of "clue" for Weka):
article story .......hockey class
0 1 ...... 0 politics
1 0 .......1 sport
1 1 .......1 ?
1 1 .......1 ?
The first two rows give Weka an example of each prediction class. Then you can predict as many instances without a class (?) as you like using your trained model.
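Either way, filling the empty class column with Weka's missing-value marker ? can be scripted rather than done by hand. A minimal Python sketch, with hypothetical file names, that rewrites the test CSV so its attributes stay compatible with the training file:

import csv

# Copy the unlabelled test set, writing '?' into the (empty) class column.
with open("test.csv", newline="") as src, \
     open("test_missing_class.csv", "w", newline="") as dst:
    reader = csv.reader(src)
    writer = csv.writer(dst)
    header = next(reader)
    writer.writerow(header)
    class_idx = header.index("class")
    for row in reader:
        row = row + [""] * (len(header) - len(row))  # pad short rows
        row[class_idx] = "?"
        writer.writerow(row)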