Before I start describing my problem, I would like to note that this question concerns a project for one of my university courses, so I am not looking for the full solution, rather for a hint or an explanation.
So, let's assume that there are 3 states {1,2,3} and that I have the transition probability matrix (3x3). I wrote a MATLAB script that, based on the transition matrix, creates a vector with N samples of the Markov chain. Assume that the first state is state 1. Now I need to Huffman-code this chain based on the conditional distribution p(X_n | X_{n-1}).
If I am not mistaken, I think I have to create 3 Huffman dictionaries and encode each symbol of the chain based on the previous state(?), which means that each symbol is encoded with one of the three dictionaries, but not all of them with the same dictionary.
If the encoding process is correct, how do I decode the coded vector?
I am not really sure if that's how it should be done.
Any ideas would be appreciated.
Thanks in advance!
That's right. There would be one Huffman code for the three transition probabilities p11, p12, and p13 out of state 1, another for p21, p22, p23, and so on.
Decoding chooses which code to use based on the current state, so either a starting state must be agreed on, or the starting state needs to be transmitted.
However, this case is a little odd, since there is only one possible Huffman code shape for three symbols: one 1-bit codeword and two 2-bit codewords, e.g. 0, 10, 11. So the only gain comes from assigning the one-bit codeword to the highest-probability symbol.
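You can check this directly; a minimal sketch, assuming Octave's communications package (where huffmandict returns a cell array of codewords) and an illustrative probability row that is not from the question:
pkg load communications
p = [0.6 0.3 0.1];                % a hypothetical transition-matrix row
dict = huffmandict(1:3, p);       % dict{k} is the codeword for symbol k
lengths = cellfun(@length, dict)  % always one 1-bit and two 2-bit codes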
Well, having solved the problem above, I decided to post the answer with the Octave script in case anyone needs it in the future.
So, let's assume that there are 5 states {1,2,3,4,5} and that I have the transition probability matrix (5x5). I Huffman-encoded and decoded the Markov chain over 1000 Monte Carlo experiments.
The Octave script is:
% Starting state of the chain
starting_value = 1;
% Chain length
chain_length = 100;
% Number of Monte Carlo experiments
MC = 1000;
% Counts the experiments in which encoding/decoding was correct
count = 0;

% Create unique symbols, and assign probabilities of occurrence to them.
% pk is row k of the transition matrix, i.e. p(X_n | X_{n-1} = k).
symbols = 1:5;
p1 = [0.5   0.125 0.125  0.125  0.125];
p2 = [0.25  0.125 0.0625 0.0625 0.5  ];
p3 = [0.25  0.125 0.125  0.25   0.25 ];
p4 = [0.125 0     0.5    0.25   0.125];
p5 = [0     0.5   0.25   0.25   0    ];

% Create one Huffman dictionary per state, based on that state's
% conditional distribution.
dict1 = huffmandict(symbols, p1);
dict2 = huffmandict(symbols, p2);
dict3 = huffmandict(symbols, p3);
dict4 = huffmandict(symbols, p4);
dict5 = huffmandict(symbols, p5);
% Collect them in a cell array so the encoder and decoder can select a
% dictionary by the previous state.
dicts = {dict1, dict2, dict3, dict4, dict5};

% The transition matrix (row k is the distribution of the next state
% given the current state k, i.e. p1..p5 from above)
T = [0.5   0.125 0.125  0.125  0.125;
     0.25  0.125 0.0625 0.0625 0.5;
     0.25  0.125 0.125  0.25   0.25;
     0.125 0     0.5    0.25   0.125;
     0     0.5   0.25   0.25   0];

% Initialize the Markov chain
chain = zeros(1, chain_length);
chain(1) = starting_value;

for mc = 1:MC
  comp = [];
  dsig = [];

  % Generate the Markov chain by sampling each next state from the row
  % of T selected by the current state.
  for n = 2:chain_length
    this_step_distribution = T(chain(n-1), :);
    cumulative_distribution = cumsum(this_step_distribution);
    r = rand();
    chain(n) = find(cumulative_distribution > r, 1);
  end

  % Encode the random symbols. The first symbol is encoded with dict1
  % (the starting state is known to be 1); every later symbol is encoded
  % with the dictionary belonging to the previous state.
  comp = huffmanenco(chain(1), dict1);
  for n = 2:chain_length
    comp = horzcat(comp, huffmanenco(chain(n), dicts{chain(n-1)}));
  end

  % Decode the data and verify that it matches the original chain. The
  % decoder knows the starting state, so at each step it can select the
  % same dictionary the encoder used.
  dsig(1) = starting_value;
  comp = comp(length(dict1{1,1})+1:end);  % strip the first symbol's codeword
  for n = 2:chain_length
    d = dicts{dsig(end)};
    temp = huffmandeco(comp, d);    % decode the remaining bitstream...
    dsig = horzcat(dsig, temp(1));  % ...but keep only the first symbol
    comp = comp(length(d{temp(1)})+1:end);  % strip that symbol's codeword
  end

  count = count + isequal(chain, dsig);
end
count
The "variable" count is to make sure that in all of the MC experiments, the Markov Chain that was produced was properly encoded and decoded. (Obviously, if count equals to 1000, then all the experiments had correct results)
I am training my neural network with data from 3 consecutive days and testing it with data from a 4th day. The values in this example are chosen at random and have no relation to reality. I want the neural network to learn to predict the current from the temperature and the solar radiation.
%% initialize data for training
Temperature_Day1 = [25 26 27 26 25];
Temperature_Day2 = [25 24 24 23 24];
Temperature_Day3 = [21 20 22 21 20];
SolarRadiation_Day1 = [990 944 970 999 962];
SolarRadiation_Day2 = [993 947 973 996 967];
SolarRadiation_Day3 = [993 948 973 998 965];
Current_Day1 = [0.11 0.44 0.44 0.45 0.56];
Current_Day2 = [0.41 0.34 0.43 0.55 0.75];
Current_Day3 = [0.34 0.98 0.34 0.76 0.71];
Day1 = [Temperature_Day1; SolarRadiation_Day1]; % 2-by-5
Day2 = [Temperature_Day2; SolarRadiation_Day2]; % 2-by-5
Day3 = [Temperature_Day3; SolarRadiation_Day3]; % 2-by-5
%% training input and training target
Training_Input = [Day1; Day2; Day3]; % 6-by-5
Training_Target = [Current_Day1; Current_Day2; Current_Day3]; % 3-by-5
%% training the network
hiddenLayers = 2;
net = newcf(Training_Input, Training_Target, hiddenLayers); % third argument: hidden layer size(s)
y = sim(net, Training_Input); % simulates the still-untrained net
net.trainParam.epochs = 100;
net = train(net, Training_Input, Training_Target);
%% initialize data for prediction
Temperature_Day4 = [45 23 22 11 24];
SolarRadiation_Day4 = [960 984 980 993 967];
Current_Day4 = [0.14 0.48 0.37 0.46 0.77];
Day4 = [Temperature_Day4; SolarRadiation_Day4]; % 2-by-5
Test_Input = [Day4; Day4; Day4]; % same dimension as Training_Input; subject to question
%% prediction
Predicted_Target = sim(net, Test_Input); % yields 3-by-5
My question is: how do I train the network with the data of 3 days and then predict the target of the 4th day? Since training and test inputs must have the same dimensions, how do I test it with only one day? Here I worked around it by concatenating three identical copies of the test input. However, this also yields 3 different rows of predicted targets.
What is here the right way to do it?
BTW: I have seen this type of question many times, but the answers are never satisfying because they always suggest changing the dimensions of the test input without considering the nature of the problem (which is that only one data set is available for testing). So please don't mark this as a duplicate.
The features that you have for your network are Temperature and SolarRadiation, each taken at specific times during the day. The day on which these readings are taken is irrelevant (otherwise you wouldn't be able to predict the outputs for day 4 given data for days 1-3).
This means that we can simply pass each observation separately by concatenating the days horizontally (and similarly for the target data):
Training_Input = [Day1, Day2, Day3]; % 2-by-15
Training_Target = [Current_Day1, Current_Day2, Current_Day3]; % 1-by-15
The resulting network will give you one output (Current) per observation in the test set, so you don't need to duplicate:
Day4 = [Temperature_Day4; SolarRadiation_Day4]; % 2-by-5
Test_Input = [Day4]; % 2-by-5
Predicted_Target will now be 1-by-5, showing the predicted Current for each of the test observations.
You might consider adding a third input feature representing the time at which each observation was taken. Assuming that you have t time slots per day at which observations are taken (so length(Temperature) == length(SolarRadiation) == t for all days) and that slot s occurs at the same time every day, you can add a feature called TimeSlot:
TimeSlot_Day1 = 1:numel(Temperature_Day1);
TimeSlot_Day2 = 1:numel(Temperature_Day2);
TimeSlot_Day3 = 1:numel(Temperature_Day3);
Day1 = [Temperature_Day1; SolarRadiation_Day1; TimeSlot_Day1]; % 3-by-5
Day2 = [Temperature_Day2; SolarRadiation_Day2; TimeSlot_Day2]; % 3-by-5
Day3 = [Temperature_Day3; SolarRadiation_Day3; TimeSlot_Day3]; % 3-by-5
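Putting it together, a minimal end-to-end sketch using the same newcf/train/sim calls as in the question (keep or drop the TimeSlot rows consistently in both training and test data):
% Train on 15 observations (3 features each), then predict day 4.
Training_Input  = [Day1, Day2, Day3];                          % 3-by-15
Training_Target = [Current_Day1, Current_Day2, Current_Day3];  % 1-by-15
net = newcf(Training_Input, Training_Target, 2);  % hidden layer size 2
net.trainParam.epochs = 100;
net = train(net, Training_Input, Training_Target);
TimeSlot_Day4 = 1:numel(Temperature_Day4);
Day4 = [Temperature_Day4; SolarRadiation_Day4; TimeSlot_Day4]; % 3-by-5
Predicted_Target = sim(net, Day4);                             % 1-by-5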
I use libsvm to train an SVM model in MATLAB, but when I call
model = svmtrain(labels,Feature,'-t 0');
It gives me this result:
*
optimization finished, #iter = 1
nu = nan
obj = nan, rho = nan
nSV = 0, nBSV = 0
Total nSV = 0
My positive and negative samples are almost equal in number (935 vs 904), so this problem is not caused by an unbalanced training set. I also tried other kernels and none of them works.
You do not want to use svmtrain any more; it is deprecated. The replacement is templateSVM paired with fitcecoc. The documentation pages on both functions are quite extensive.
You'll eventually want to use your model to predict other data, use predict for that.
I recently encountered similar problems when trying to classify terrain in a point cloud with more than two classes. templateSVM and fitcecoc solved my problems.
The code I used is as follows, where trainingdata is my 5-dimensional training data and groups contains the label for each class, corresponding to the cell array classes.
SVMtemp = templateSVM('KernelFunction','polynomial','IterationLimit',1e4,...
'PolynomialOrder',4,'OutlierFraction',ExpOut,...
'Standardize',true); % Create SVM template
Model = fitcecoc(trainingdata(:,4:8),groups,'learners',SVMtemp,'ClassNames',...
classes); % Create the SVM model
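To label new observations afterwards (the predict step mentioned above), something like this should work, where testdata is a hypothetical matrix with the same five feature columns used in training:
predictedClasses = predict(Model, testdata(:,4:8)); % one predicted class per row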
I completed the computation of the original HMAX model, and I have the results at the C2 layer. Now the tuned layer remains, in other words, using osusvm.
In my project I have two directories: one containing the training images and the other containing the test images.
Reference: lennon310's response in Training images and test images
Firstly, I would like to show you my results at the C2 layer (the results are, of course, vectors). Notice that I extracted only 2 prototypes in the S2 layer (in my project I used 256 prototypes, but for this question assume that I used only 2) and four prototype sizes: [4 8 12 16]. So for each image we get 8 C2 units (2 prototypes x 4 patch sizes = 8).
C2res{1}: For the six training images:
0.0088 0.0098 0.0030 0.0067 0.0063 0.0057
0.0300 0.0315 0.0251 0.0211 0.0295 0.0248
0.1042 0.1843 0.1151 0.1166 0.0668 0.1134
0.3380 0.2529 0.3709 0.2886 0.3938 0.3078
0.2535 0.3255 0.3564 0.2196 0.1681 0.2827
3.9902 5.3475 4.5504 4.9500 6.7440 4.4033
0.8520 0.8740 0.7209 0.7705 0.4303 0.7687
6.3131 7.2560 7.9412 7.1929 9.8789 6.6764
C2res{2}: For the two test images:
0.0080 0.0132
0.0240 0.0001
0.1007 0.2214
0.3055 0.0249
0.2989 0.3483
4.6946 4.2762
0.7048 1.2791
6.7595 4.7728
Secondly, I downloaded the osu-svm matlab toolbox and I added its path:
addpath(genpath('./osu-svm/')); %put your own path to osusvm here
useSVM = 1; %if you do not have osusvm installed you can turn this
%to 0, so that the classifier would be a NN classifier
%note: NN is not a great classifier for these features
Then I used the code below:
%Simple classification code
XTrain = [C2res{1}]; %training examples as columns
XTest = [C2res{2}]; %test examples as columns
ytrain = [ones(size(C2res{1},2),1)]; %the labels of the training set
ytest = [ones(size(C2res{2},2),1)]; %the true labels of the test set
if useSVM
    Model = CLSosusvm(XTrain,ytrain);  %training
    [ry,rw] = CLSosusvmC(XTest,Model); %predicting new labels
else %use a Nearest Neighbor classifier
    Model = CLSnn(XTrain, ytrain);     %training
    [ry,rw] = CLSnnC(XTest,Model);     %predicting new labels
end
successrate = mean(ytest==ry) %a simple classification score
Is the code just above correct? Why do I always get successrate = 1? I think that I am wrong in some places, so I need help, please. If it is correct, is there another way to compute it? What can I use instead of successrate in order to get more informative results?
Note:
The function CLSosusvm is :
function Model = CLSosusvm(Xtrain,Ytrain,sPARAMS);
%function Model = CLSosusvm(Xtrain,Ytrain,sPARAMS);
%
%Builds an SVM classifier
%This is only a wrapper function for osu svm
%It requires that osu svm (http://www.ece.osu.edu/~maj/osu_svm/) is installed and included in the path
%X contains the data-points as COLUMNS, i.e., X is nfeatures \times nexamples
%y is a column vector of all the labels. y is nexamples \times 1
%sPARAMS is a structure of parameters:
%sPARAMS.KERNEL specifies the kernel type
%sPARAMS.C specifies the regularization constant
%sPARAMS.GAMMA, sPARAMS.DEGREE are parameters of the kernel function
%Model contains the parameters of the SVM model as returned by osu svm

Ytrain = Ytrain';

% Fall back to default parameters (linear kernel, C = 1) when sPARAMS
% is missing or empty.
if nargin < 3
    SETPARAMS = 1;
elseif isempty(sPARAMS)
    SETPARAMS = 1;
else
    SETPARAMS = 0;
end

if SETPARAMS
    sPARAMS.KERNEL = 0;
    sPARAMS.C = 1;
end

switch sPARAMS.KERNEL
    case 0  % linear kernel
        [AlphaY, SVs, Bias, Parameters, nSV, nLabel] = ...
            LinearSVC(Xtrain, Ytrain, sPARAMS.C);
    case 1  % polynomial kernel
        [AlphaY, SVs, Bias, Parameters, nSV, nLabel] = ...
            PolySVC(Xtrain, Ytrain, sPARAMS.DEGREE, sPARAMS.C, 1, 0);
    case 2  % polynomial kernel with custom coefficient
        [AlphaY, SVs, Bias, Parameters, nSV, nLabel] = ...
            PolySVC(Xtrain, Ytrain, sPARAMS.DEGREE, sPARAMS.C, 1, sPARAMS.COEF);
    case 3  % RBF kernel
        [AlphaY, SVs, Bias, Parameters, nSV, nLabel] = ...
            RbfSVC(Xtrain, Ytrain, sPARAMS.GAMMA, sPARAMS.C);
end

Model.AlphaY = AlphaY;
Model.SVs = SVs;
Model.Bias = Bias;
Model.Parameters = Parameters;
Model.nSV = nSV;
Model.nLabel = nLabel;
Model.sPARAMS = sPARAMS;
The function CLSosusvmC is :
function [Labels, DecisionValue]= CLSosusvmC(Samples, Model);
%function [Labels, DecisionValue]= CLSosusvmC(Samples, Model);
%
%wrapper function for osu svm classification
%Samples contains the data-points to be classified as COLUMNS, i.e., it is nfeatures \times nexamples
%Model is the model returned by CLSosusvm
%Labels are the predicted labels
%DecisionValue are the values assigned by the Model to the points (Labels = sign(DecisionValue))
[Labels, DecisionValue]= SVMClass(Samples, Model.AlphaY, ...
Model.SVs, Model.Bias, ...
Model.Parameters, Model.nSV, Model.nLabel);
Labels = Labels';
DecisionValue = DecisionValue';
Your code looks good to me.
Since you have only 2 test images, the possible success rates are limited to 0, 0.5, and 1, and you would expect to hit 100% accuracy with 25% probability by guessing alone ([0 1],[1 0],[1 1],[0 0]). You can shuffle the data, re-select 2 of the 8 images as the test set several times, and observe the accuracy.
Also try to add more images to both training and test samples.
Machine learning makes little sense on a set of 8 images. Gather at least 10x more data, and then analyze the results. With such a small dataset any result is possible (from 0 to 100 percent), and none of them is reliable.
Meanwhile you can try to perform repeated cross-validation:
Shuffle your data
Split it into two-element parts ( [1 2][3 4][5 6][7 8] )
For each of these parts:
a) test on it, while training on the rest, so for example:
train on [3 4 5 6 7 8] and test on [1 2]
b) record the score
Repeat the whole process and report the mean score. A sketch of this scheme is below.
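A minimal sketch of the repeated leave-two-out scheme, reusing the CLSosusvm/CLSosusvmC wrappers from the question; X (features as columns, nfeatures-by-8) and y (the 8 true two-class labels) are assumptions standing in for your pooled C2 data:
nReps = 50;                        % number of shuffles
scores = zeros(nReps, 4);          % one score per two-element fold
for rep = 1:nReps
    idx = randperm(8);             % shuffle the 8 images
    for k = 1:4                    % folds [1 2], [3 4], [5 6], [7 8]
        testIdx  = idx(2*k-1:2*k);
        trainIdx = setdiff(idx, testIdx);
        Model = CLSosusvm(X(:,trainIdx), y(trainIdx));  % train on the rest
        ry    = CLSosusvmC(X(:,testIdx), Model);        % test on the fold
        yt    = y(testIdx);
        scores(rep,k) = mean(ry(:) == yt(:));
    end
end
meanScore = mean(scores(:))        % report the mean score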
Can someone help me solve this?
I want to test whether this classification is already good or not, so I try with data testing = data training; it should give 100% accuracy if the classification is good.
This is the code that I found on this site:
data= [170 66 ;
160 50 ;
170 63 ;
173 61 ;
168 58 ;
184 88 ;
189 94 ;
185 88 ]
labels=[-1;-1;-1;-1;-1;1;1;1];
numInst = size(data,1);
numLabels = max(labels);
testVal = [1 2 3 4 5 6 7 8];
trainLabel = labels(testVal,:);
trainData = data(testVal,:);
testData=data(testVal,:);
testLabel=labels(testVal,:);
numTrain = 8; numTest = 8;
%# train one-against-all models
model = cell(numLabels,1);
for k=1:numLabels
    model{k} = svmtrain(double(trainLabel==k), trainData, '-c 1 -t 2 -g 0.2 -b 1');
end
%# get probability estimates of test instances using each model
prob = zeros(numTest,numLabels);
for k=1:numLabels
    [~,~,p] = svmpredict(double(testLabel==k), testData, model{k}, '-b 1');
    prob(:,k) = p(:,model{k}.Label==1); %# probability of class==k
end
%# predict the class with the highest probability
[~,pred] = max(prob,[],2);
acc = sum(pred == testLabel) ./ numel(testLabel) %# accuracy
C = confusionmat(testLabel, pred) %# confusion matrix
and these are the results:
optimization finished, #iter = 16
nu = 0.645259
obj = -2.799682, rho = -0.437644
nSV = 8, nBSV = 1
Total nSV = 8
Accuracy = 100% (8/8) (classification)
acc =
0.3750
C =
0 5
0 3
I don't know why there are two accuracies, and they differ: the first one is 100% and the second one is 0.375. Is my code wrong? It should be 100%, not 37.5%. Can you help me correct this code?
If you're using libsvm then you should change the name of the MEX file, since MATLAB already has an svm toolbox with a function named svmtrain. However, the code is running, so it seems you did change the name, just not in the code you provided.
The second one is wrong, I don't know exactly why. However, I can tell you that you will almost always get 100% accuracy if you use test_Data = training_Data. That result really does not mean anything, since the algorithm can overfit without it showing in your results. Test your algorithm against new data, and that will give you a realistic accuracy.
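For example, a minimal held-out split reusing the variables from the question (the choice of rows here is purely illustrative):
trainIdx = [1 2 3 4 6 7];          % training rows
testIdx  = [5 8];                  % held-out rows, never seen in training
trainData = data(trainIdx,:);  trainLabel = labels(trainIdx,:);
testData  = data(testIdx,:);   testLabel  = labels(testIdx,:);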
Is that the code you're using? I don't think your svmtrain invocation is valid. You should have svmtrain(MAT, VECT, ...) where MAT is a matrix of data and VECT is a vector with the labels of each row of MAT. The remaining parameters are string-value pairs, meaning you'll have a string identifier and its corresponding value.
When I ran your code (Linux, R2011a) I got an error on the svmtrain call. Running with svmtrain(trainData, double(trainLabel==k)) gave a valid output (for that line). Of course, it appears that you're not using pure MATLAB, as your svmpredict call isn't native MATLAB, but rather a MATLAB binding from LIBSVM...
The problem is in
C = confusionmat(testLabel, pred)
Swap the positions of the arguments:
C = confusionmat(pred, testLabel)
or use this
[ConMat, order] = confusionmat(pred, testLabel)
which shows the confusion matrix and the class order.
The problem is in
[~,~,p] = svmpredict(double(testLabel==k), testData, model{k}, '-b 1');
p does not contain the predicted labels; it holds the probability estimates of each label being correct. LIBSVM's svmpredict already calculates accuracy for you correctly, which is why it says 100% in the debug output.
The fix is simple:
[p,~,~] = svmpredict(double(testLabel==k), testData, model{k}, '-b 1');
According to LIBSVM's Matlab bindings README:
The function 'svmpredict' has three outputs. The first one,
predicted_label, is a vector of predicted labels. The second output,
accuracy, is a vector including accuracy (for classification), mean
squared error, and squared correlation coefficient (for regression).
The third is a matrix containing decision values or probability
estimates (if '-b 1' is specified). If k is the number of classes
in training data, for decision values, each row includes results of
predicting k(k-1)/2 binary-class SVMs. For classification, k = 1 is a
special case. Decision value +1 is returned for each testing instance,
instead of an empty vector. For probabilities, each row contains k values
indicating the probability that the testing instance is in each class.
Note that the order of classes here is the same as 'Label' field
in the model structure.
I am sorry to say that all the answers above are wrong!
The main error in the code is:
numLabels = max(labels);
because with labels -1 and 1 it returns 1, whereas it should return 2 (one-against-all needs one model per class) so that the svmtrain/svmpredict loops run twice; it would return 2 if the labels were the positive numbers 1 and 2.
Anyway, change the line labels=[-1;-1;-1;-1;-1;1;1;1];
to labels=[2;2;2;2;2;1;1;1];
and it will work successfully ;)
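A more general fix (my own suggestion, not part of the answer above) is to remap whatever labels you have onto 1..k before the one-against-all loop:
[classes, ~, labels] = unique(labels);  % e.g. -1/1 become 1/2
numLabels = numel(classes);             % number of distinct classes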