How to project PCA features from train data to test data in Spark Scala?

I read the linked question, "Anomaly detection with PCA in Spark", which explains the approach.
But what is the code to extract the PCA features from the training data and project the test data onto them?
From what I understand, the same set of features has to be used for both the training and the test data.
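One possible way to do this with the DataFrame-based `org.apache.spark.ml.feature.PCA` is sketched below: fit the PCA model on the training data only, then call `transform` on both sets with that same fitted model. The toy vectors, the column names `features` and `pcaFeatures`, and the choice `k = 2` are assumptions made for illustration, not part of the original question.

```scala
import org.apache.spark.ml.feature.PCA
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.SparkSession

// In spark-shell a session already exists and getOrCreate() simply returns it.
val spark = SparkSession.builder()
  .appName("pca-train-test")
  .master("local[*]")
  .getOrCreate()

// Toy data (made up for illustration): each row is one 4-dimensional feature vector.
val train = spark.createDataFrame(Seq(
  Vectors.dense(2.0, 0.0, 3.0, 4.0),
  Vectors.dense(4.0, 1.0, 0.0, 6.0),
  Vectors.dense(6.0, 2.0, 1.0, 8.0),
  Vectors.dense(1.0, 3.0, 5.0, 7.0)
).map(Tuple1.apply)).toDF("features")

val test = spark.createDataFrame(Seq(
  Vectors.dense(3.0, 1.0, 2.0, 5.0),
  Vectors.dense(5.0, 2.0, 4.0, 9.0)
).map(Tuple1.apply)).toDF("features")

// Fit PCA on the training data only; k is the number of components to keep.
val pcaModel = new PCA()
  .setInputCol("features")
  .setOutputCol("pcaFeatures")
  .setK(2)
  .fit(train)

// Reuse the same fitted model for both sets, so the test vectors are
// projected onto the components learned from the training data.
val trainPca = pcaModel.transform(train)
val testPca = pcaModel.transform(test)

testPca.select("pcaFeatures").show(false)
```

The fitted `PCAModel` exposes the component matrix as `pcaModel.pc` (here 4 rows by 2 columns), and it can be saved and reloaded if the test data is scored in a separate job.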

Related

Incremental training of ALS model

I'm trying to find out whether "incremental training of an ALS model" is possible on Kinesis streaming data using MLlib in Apache Spark.
I receive real-time user interactions from a Kinesis stream, but to get updated predictions I have to retrain the model on the whole dataset, which takes some time.
I'm trying to figure out whether I can train the ALS model incrementally on the streaming data, but I cannot find an answer.
Incremental training of ALS

Implementation of knowledge flow environment in Weka using training and test data set

I would like to compare the ROC curves produced by various classifiers using the WEKA Knowledge Flow platform. I have a training dataset and a test dataset. I want to build the model on the training dataset and then supply the test dataset to build the ROC curves. Based on my understanding, I have created a Knowledge Flow layout.
However, I am not sure that my implementation is correct.

Usage of Libsvm model

I've developed a model using Libsvm in Matlab. I chose the best parameters using CV and obtained the model by training on the whole dataset. I use normalization to get better results:
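% Column-wise maximum (plus a small offset) and minimum of the training data
maximum = max(TR) + 0.00001;
minimum = min(TR);
% Scale each column; subtracting the maximum maps the values into [-1, 0)
for i = 1:size(TR,2)
    training(1:size(TR,1),i) = double(TR(1:size(TR,1),i) - maximum(i)) / (maximum(i) - minimum(i));
end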
Now how can I use my model directly to classify new data, i.e. records that have no class label? Do I have to build the functions manually from the model information?
Are you using libsvmtrain to train on your training data? If so, it returns a model structure that you can use to classify test or future data: pass that structure to svmpredict along with the test data. Scale the new data first with the same maximum and minimum that were computed from the training set.

Libsvm vs Weka (WLSVM)

I have to deal with an unbalanced dataset (95% of the records are in the negative class and 5% in the positive class). I developed a model using a decision tree in the Weka framework. Now I'd like to try SVM with Libsvm to get better results. I'm trying to use both Libsvm for Matlab and the Libsvm wrapper for Weka, and I'd like to know how to compare the results I get from them. In Weka, a model is built from the whole dataset and then a 10-fold cross validation is performed. How can I do the same with Libsvm? From the Libsvm FAQ I learned that CV is used only to find the best kernel parameters, not during train/predict, so what exact sequence of steps should I follow in Matlab to obtain results that are comparable with Weka's?

svm classification

I am a beginner in MATLAB and am doing my programming project in digital image processing: magnetic resonance image classification using wavelet features + SVM + PCA + ANN. I ran the example SVM classification from the MATLAB toolbox and modified it to fit my requirements. I am having trouble storing more than one feature in an input vector and giving new input to the SVM. Please help.
Simply pass the multidimensional feature data to svmtrain(Training, Group) as the Training argument (Training can be a matrix in which each row is one observation and each column is a separate feature). Then use svmclassify(SVMStruct, Sample) to classify the test data.