I need to classify objects using fuzzy logic. Each object is characterized by 4 features: {size, shape, color, texture}. Each feature is fuzzified by linguistic terms and a membership function. The problem is that I do not understand how to defuzzify so that I can tell which class an unknown object belongs to. Can somebody help me solve this using Mamdani max-min inference?
Objects = {Dustbin, Can, Bottle, Cup}, denoted {1,2,3,4} respectively. The fuzzy sets for each feature are:
Feature: Size
$\tilde{Size_{Large}}$ = {1/1, 1/2, 0/3, 0.6/4} for crisp values in range 10 cm - 20 cm
$\tilde{Size_{Small}}$ = {0/1, 0/2, 1/3, 0.4/4} (4 cm - 10 cm)
Feature: Shape
$\tilde{Shape_{Square}}$ = {0.9/1, 0/2, 0/3, 0/4} for crisp values in range 50 - 100
$\tilde{Shape_{Cylindrical}}$ = {0.1/1, 1/2, 1/3, 1/4} (10 - 40)
Feature: Color
$\tilde{Color_{Reddish}}$ = {0/1, 0.8/2, 0.6/3, 0.3/4}, say red values between 10 - 50 (not sure, assuming)
$\tilde{Color_{Greenish}}$ = {1/1, 0.2/2, 0.4/3, 0.7/4}, say color values in 100 - 200
Feature: Texture
$\tilde{Tex_{Coarse}}$ = {0.2/1, 0.2/2, 0/3, 0.5/4} for texture crisp values 10 - 20
$\tilde{Tex_{Shiny}}$ = {0.8/1, 0.8/2, 1/3, 0.5/4} (30 - 40)
The IF-THEN rules for classification are:
R1: IF object is large in size AND cylindrical shape AND greenish in color AND coarse in texture THEN object is Dustbin
or in tabular form just to save space
Object type   Size    Shape         Color      Texture
Dustbin       large   cylindrical   greenish   coarse
Can           small   cylindrical   reddish    shiny
Bottle        small   cylindrical   reddish    shiny
Cup           small   cylindrical   greenish   shiny
Then there is an unknown object with crisp feature values X = {12 cm, 52, 120, 11}. How do I classify it? Or is my understanding incorrect, and I need to reformulate the entire thing?
Fuzzy logic means that every pattern belongs to each class up to a level. In other words, the output of the algorithm for every pattern could be a vector of, say, degrees of similarity to each class that sum up to unity. The decision for a class could then be taken by checking a threshold. In that sense, the purpose of fuzzy logic is to quantify the uncertainty. If all you need is a hard decision for your case, a simple minimum-distance classifier or a majority vote should be enough. Otherwise, redefine your problem by taking the "number factor" into consideration.
One possible approach is to define a centroid for each feature's linguistic term, for example Large_size = 15 cm and Small_size = 7 cm. The membership function can then be defined as a function of the distance from these centroids. Then you could do the following:
1) For every feature, compute the Euclidean distance from the centroid and pass it through a Gaussian or Butterworth kernel (to capture the range around the centroid). Prepare one kernel profile per class; for example, dustbin as a target needs large size, coarse texture, etc.
2) Take the product of these per-feature values (this is a Naive-Bayes-like combination). Fuzzy logic ends here.
3) Then, you could assign the pattern to the class with the highest value of the membership function.
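A minimal Python sketch of steps 1) to 3), under loud assumptions: only Large_size = 15 cm and Small_size = 7 cm come from the text above; every other centroid and all the kernel bandwidths below are illustrative guesses, not values from the question.

```python
import math

# Per-class feature centroids. Size centroids are from the text; shape,
# color and texture centroids (and all bandwidths) are assumed for
# illustration, loosely based on the crisp ranges quoted in the question.
centroids = {
    "Dustbin": {"size": 15, "shape": 25, "color": 150, "texture": 15},
    "Can":     {"size": 7,  "shape": 25, "color": 30,  "texture": 35},
    "Bottle":  {"size": 7,  "shape": 25, "color": 30,  "texture": 35},
    "Cup":     {"size": 7,  "shape": 25, "color": 150, "texture": 35},
}
sigma = {"size": 4, "shape": 25, "color": 50, "texture": 10}  # kernel widths

def memberships(x):
    """Steps 1 and 2: Gaussian kernel of the distance to each class
    centroid, multiplied over all features (Naive-Bayes-like product),
    then normalized so the scores sum to unity."""
    scores = {}
    for cls, c in centroids.items():
        prod = 1.0
        for f, v in x.items():
            prod *= math.exp(-((v - c[f]) ** 2) / (2 * sigma[f] ** 2))
        scores[cls] = prod
    total = sum(scores.values())
    return {cls: s / total for cls, s in scores.items()}

# Step 3: assign the unknown X = {12 cm, 52, 120, 11} to the top score.
x = {"size": 12, "shape": 52, "color": 120, "texture": 11}
scores = memberships(x)
best = max(scores, key=scores.get)
```

With these (assumed) numbers the unknown X comes out as Dustbin, mainly because its size, color and texture values sit closest to the Dustbin centroids.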
Sorry for taking too long to answer, hope this will help.
Related
If I calculate the length of the shortest path using networkx as:
path_length = nx.shortest_path_length(G, source = origin, target = destination, weight = 'distance')
How does networkx know to interpret the edge attribute as a distance or a weight?
The documentation says either is acceptable but doesn't specify how the attribute will be interpreted.
In the case of a weight, I would expect high values to be preferred, i.e. the shortest path traveling through the edges with the highest weights.
In the case of a distance, I would expect lower values to be preferred, to minimize total distance.
Am I missing something conceptually?
The results I've gotten are consistent with my expectations for distances but it's uncomfortable that I can't find anything in the docs that clarifies this.
From the docs:
weight (None or string, optional (default = None)) – If None, every
edge has weight/distance/cost 1. If a string, use this edge attribute
as the edge weight. Any edge attribute not present defaults to 1.
So whether it is a distance or a weight, the objective is minimization. Words like profit/utility usually refer to maximization, while weight/distance/cost refer to minimization; some others, like fitness, may be used in both senses.
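To see why the attribute's name cannot matter, here is what the algorithm behind shortest_path_length (Dijkstra) effectively does, as a minimal pure-Python sketch on a toy graph of my own (not the networkx implementation):

```python
import heapq

def shortest_path_length(graph, source, target):
    """Minimal Dijkstra: minimizes the sum of edge attribute values,
    whatever the attribute is called ('weight', 'distance', 'cost', ...)."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    raise ValueError("no path between source and target")

# Direct edge A-C has value 10; the two-hop route A-B-C totals only 3.
G = {"A": {"C": 10, "B": 1}, "B": {"C": 2}}
```

The two-hop route wins: lower totals are always preferred, regardless of what the edge attribute is named.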
I have built a pretty basic Naive Bayes on Apache Spark, using MLlib of course. But I have a few clarifications on what exactly neutrality means.
From what I understand, in a given dataset there are pre-labeled sentences covering the necessary classes; let's take 3 for the example below.
0-> Negative sentiment
1-> Positive sentiment
2-> Neutral sentiment
This neutral is pre-labeled in the training set itself.
Is there any other form of neutrality handling? Suppose there are no neutral sentences available in the dataset; is it then possible to derive neutrality from the probability scale, like:
0.0 - 0.4 => Negative
0.4 - 0.6 => Neutral
0.6 - 1.0 => Positive
Is this kind of mapping possible in Spark? I searched around but could not find any. The NaiveBayesModel class in the RDD API has a predict method which just returns a double mapped according to the training set, i.e. if only 0 and 1 are there it will return only 0 or 1, not a score scaled to 0.0 - 1.0 as above.
Any pointers/advice on this would be incredibly helpful.
Edit - 1
Sample code
//Performs tokenization, POS tagging and then lemmatization
//Returns an array of strings
val tokenizedString = Util.tokenizeData(text)
val hashingTF = new HashingTF()
//Returns a double
//According to the training set 1.0 => Positive, 0.0 => Negative
val status = model.predict(hashingTF.transform(tokenizedString.toSeq))
if(status == 1.0) "Positive" else "Negative"
Sample dataset content
1,Awesome movie
0,This movie sucks
Of course the original dataset contains longer sentences, but this should be enough for explanation, I guess.
The above code is what I am using for prediction. My questions are the same:
1) Neutrality handling in dataset
In the above dataset if I am adding another category such as
2,This movie can be enjoyed by kids
For argument's sake, let's assume that it is a neutral review; then the model.predict method will return 1.0, 0.0 or 2.0 based on the passed-in sentence.
2) Using model.predictProbabilities gives an array of doubles, but I am not sure in what order the results come, i.e. is index 0 for negative or for positive? With three classes, i.e. Negative, Positive, Neutral, in what order will that method return the predictions?
It would have been helpful to have the code that builds the model (for your example to work, the 0.0 from the dataset must be converted to a Double in the model, either by indexing it with a StringIndexer stage or by converting it when reading the file), but assuming that this code works:
val status = model.predict(hashingTF.transform(tokenizedString.toSeq))
if(status == 1.0) "Positive" else "Negative"
Then yes, it means the probability at index 0 is that of negative and the one at index 1 that of positive (it's a bit strange and there must be a reason, but everything is a double in ML, even feature and category indexes). If you have something like this in your code:
val labelIndexer = new StringIndexer()
.setInputCol("sentiment")
.setOutputCol("indexedsentiment")
.fit(trainingData)
Then you can use labelIndexer.labels to identify the labels (the probability at index 0 corresponds to labelIndexer.labels at index 0).
Now regarding your other questions.
Neutrality can mean two different things. Type 1: a review contains as many positive as negative words. Type 2: there is (almost) no sentiment expressed.
A Neutral category can be very helpful if you want to manage Type 2. If that is the case, you need neutral examples in your dataset. Naive Bayes is not a good classifier to apply thresholding on the probabilities in order to determine Type 2 neutrality.
Option 1: Build a dataset (if you think you will have to deal with a lot of Type 2 neutral texts). The good news is, building a neutral dataset is not too difficult. For instance you can pick random texts that are not movie reviews and assume they are neutral. It would be even better if you could pick content that is closely related to movies (but neutral), like a dataset of movie synopsis. You could then create a multi-class Naive Bayes classifier (between neutral, positive and negative) or a hierarchical classifier (first step is a binary classifier that determines whether a text is a movie review or not, second step to determine the overall sentiment).
Option 2 (can be used to deal with both Type 1 and Type 2). As I said, Naive Bayes is not great at dealing with thresholds on the probabilities, but you can try that. Without a dataset, though, it will be difficult to determine the thresholds to use. Another approach is to count the number of words or stems that have a significant polarity. One quick and dirty way to achieve that is to query your classifier with each individual word and count the number of times it returns "positive" with a probability significantly higher than that of the negative class (discard the word if the probabilities are too close to each other, for instance within 25%; a bit of experimentation will be needed here). At the end, you may end up with, say, 20 positive words vs 15 negative ones and decide the text is neutral because it is balanced, or with 0 positive and 1 negative and return neutral because the count of polarized words is too low.
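A pure-Python sketch of that quick-and-dirty counting scheme. Here word_proba is a stand-in for querying your classifier with a single word, and the margin and minimum-count values are exactly the kind of thresholds you would need to tune:

```python
def classify_with_neutral(words, word_proba, margin=0.25, min_polarized=2):
    """Count words whose positive/negative probabilities differ by more
    than `margin`; call the text neutral if the polarized words are too
    few (Type 2) or balanced (Type 1)."""
    pos = neg = 0
    for w in words:
        p_neg, p_pos = word_proba(w)
        if abs(p_pos - p_neg) < margin:
            continue  # probabilities too close to each other: discard word
        if p_pos > p_neg:
            pos += 1
        else:
            neg += 1
    if pos + neg < min_polarized or pos == neg:
        return "Neutral"
    return "Positive" if pos > neg else "Negative"

# Stand-in per-word classifier returning (p_negative, p_positive);
# a real version would call the trained model on a single-word input.
lexicon = {"awesome": (0.1, 0.9), "sucks": (0.95, 0.05), "movie": (0.5, 0.5)}
proba = lambda w: lexicon.get(w, (0.5, 0.5))
```

The defaults are arbitrary; in practice you would calibrate margin and min_polarized against whatever labeled examples you can get.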
Good luck and hope this helped.
I am not sure if I understand the problem, but:
- The prior in Naive Bayes is computed from the data and cannot be set manually.
- In MLlib you can use predictProbabilities to obtain the class probabilities.
- In ML you can use setThresholds to set a prediction threshold for each class.
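If you do go the probability route (e.g. via predictProbabilities), mapping a positive-class probability onto the bands proposed in the question is straightforward. A sketch, where the 0.4/0.6 cut-offs are the asker's proposed values, not anything Spark provides:

```python
def band_label(p_positive, low=0.4, high=0.6):
    """Map the positive-class probability onto three bands:
    [0.0, low) -> Negative, [low, high] -> Neutral, (high, 1.0] -> Positive."""
    if p_positive < low:
        return "Negative"
    if p_positive <= high:
        return "Neutral"
    return "Positive"
```

Note the caveat above still applies: Naive Bayes probabilities tend to be poorly calibrated, so these bands are only as good as the thresholds you validate them with.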
I am trying to train a random forest classifier on a very imbalanced dataset with 2 classes (benign/malign).
I have seen and followed the code from a previous question (How to set up and use sample weight in the Orange python package?) and tried setting various higher weights on the minority-class instances, but the classifiers I get behave exactly the same.
My code:
data = Orange.data.Table(filename)
st = Orange.classification.tree.SimpleTreeLearner(min_instances=3)
forest = Orange.ensemble.forest.RandomForestLearner(learner=st, trees=40, name="forest")
weight = Orange.feature.Continuous("weight")
weight_id = -10
data.domain.add_meta(weight_id, weight)
data.add_meta_attribute(weight, 1.0)
for inst in data:
    if inst[data.domain.class_var] == 'malign':
        inst[weight] = 100
classifier = forest(data, weight_id)
Am I missing something?
The simple tree learner is simple: it's optimized for speed and does not support weights. I guess learning algorithms in Orange that do not support weights should raise an exception if the weight argument is specified.
If you need weights just to change the class distribution, multiply data instances instead: create a new data table and add 100 copies of each instance of malignant tumor.
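A generic sketch of this duplication trick, with plain Python lists standing in for the Orange data table (the helper name and the factor of 100 are just for illustration):

```python
def oversample(instances, labels, minority_label, factor=100):
    """Return a new dataset in which every minority-class instance appears
    `factor` times, emulating an integer instance weight."""
    new_instances, new_labels = [], []
    for inst, lab in zip(instances, labels):
        k = factor if lab == minority_label else 1
        new_instances.extend([inst] * k)
        new_labels.extend([lab] * k)
    return new_instances, new_labels

X = ["t1", "t2", "t3"]
y = ["benign", "malign", "benign"]
Xo, yo = oversample(X, y, "malign", factor=100)
```

In Orange you would build the new Orange.data.Table from the duplicated rows and train the forest on that, without passing any weight_id.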
To apply the combination of SVD perturbation:
I = imread('image.jpg');
Ibw = im2double(rgb2gray(I)); % grayscale, double precision in [0,1]
[U, S, V] = svd(Ibw);
% calculate derived image
P = U * power(S, i) * V'; % where i is between 1 and 2
% To compute the combined image of SVD perturbations:
J = (Ibw + alpha*P) / (1 + alpha); % where alpha is between 0 and 1
I applied this method to a specific face recognition model and the accuracy increased considerably, so it is very effective. Interestingly, I used the values i=3/4 and alpha=0.25, following a paper published in a journal in 2012 in which the authors used exactly those values. But I had not paid attention to the requirement that i must be between 1 and 2 (I don't know whether the authors made a typo or really used 3/4). When I tried changing i to a value greater than 1, the accuracy decreased. So can I use the value 3/4? If yes, how can I justify this choice in my approach?
The paper I read is entitled "Enhanced SVD based face recognition". On page 3, they use the value i=3/4.
(http://www.oalib.com/paper/2050079)
Kindly I need your help and opinions. Any help will be very appreciated!
The idea to have the value between one and two is to magnify the singular values to make them invariant to illumination changes.
Refer to this paper: "A New Face Recognition Method based on SVD Perturbation for Single Example Image per Person" by Daoqiang Zhang, Songcan Chen, and Zhi-Hua Zhou.
Note that when n equals 1, the derived image P is equivalent to the original image I. If we choose n > 1, then the singular values satisfying $s_i > 1$ will be magnified. Thus the reconstructed image P emphasizes the contribution of the large singular values, while restraining that of the small ones. So by integrating P into I, we get a combined image J which keeps the main information of the original image and is expected to work better against minor changes of expression, illumination and occlusions.
My take:
When you raise the singular values to a power, you are basically introducing a non-linearity, so it's possible that for a specific dataset, scaling the singular values down (an exponent below 1) is beneficial. It's like adjusting the gamma correction factor on a monitor.
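The spectrum scaling is easy to experiment with outside MATLAB; here is a NumPy sketch of the same two formulas. A handy sanity check: with n = 1 the derived image P reconstructs I exactly, so J equals I for any alpha.

```python
import numpy as np

def svd_perturb(I, n, alpha):
    """Derived image P = U * S^n * V', combined as J = (I + alpha*P)/(1+alpha).

    n > 1 magnifies singular values above 1; n = 3/4 compresses large ones
    instead, a gamma-correction-like non-linearity on the spectrum.
    """
    U, s, Vt = np.linalg.svd(I, full_matrices=False)
    P = (U * s**n) @ Vt  # scale each singular value s_i by s_i**n
    return (I + alpha * P) / (1 + alpha)
```

Trying a few values of n on your own dataset this way is probably the most honest justification: report that n = 3/4 worked best empirically, rather than appealing to the 1 < n < 2 argument from the paper.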
The objective is to see whether two images, each capturing one object, match.
The baseline image is stored as item1 (this is what is being matched against in the code).
The image that needs to be matched is input (I need to see if this matches what is stored).
My method:
Convert both images to grayscale.
Extract SURF interest points.
Obtain features.
Match features.
Keep the 50 strongest features.
Count how many of the strongest features match between the two images.
Take the ratio: number of features matched / number of strongest features (which is 50).
If I have two images of the same object (two images taken separately on a camera), ideally the ratio should be near 1 or near 100%.
However this is not the case; the best ratio I am getting is near 0.5, or even worse, 0.3.
I am aware that SURF detectors and features can be used in neural networks, or in a statistics-based approach. I believe I have taken a statistics-based approach to some extent by using the 50 strongest features.
Is there something I am missing? What do I add onto this or how do I improve it? Please provide me a point to start from.
%Clearing the workspace and all variables
clc;
clear;

%ITEM 1
item1 = imread('Loreal.jpg'); % retrieve item 1 and digitize it
item1Grey = rgb2gray(item1); % convert to grayscale, 2-dimensional matrix
item1KP = detectSURFFeatures(item1Grey, 'MetricThreshold', 600); % SURF detectors (interest points)
strong1 = item1KP.selectStrongest(50);
[item1Features, item1Points] = extractFeatures(item1Grey, strong1, 'SURFSize', 128); % using SURFSize of 128

%INPUT: acquire image
input = imread('MakeUp1.jpg'); % retrieve input and digitize it
inputGrey = rgb2gray(input); % convert to grayscale, 2-dimensional matrix
inputKP = detectSURFFeatures(inputGrey, 'MetricThreshold', 600); % SURF detectors (interest points)
strongInput = inputKP.selectStrongest(50);
[inputFeatures, inputPoints] = extractFeatures(inputGrey, strongInput, 'SURFSize', 128); % using SURFSize of 128

pairs = matchFeatures(item1Features, inputFeatures, 'MaxRatio', 1); % matching SURF features
totalFeatures = size(item1Features, 1); % baseline number of features (rows, not length())
numPairs = size(pairs, 1); % number of matched pairs (rows of the N-by-2 index matrix)
percentage = numPairs / totalFeatures;
if percentage >= 0.49
    disp('We have this');
else
    disp('We do not have this');
    disp(percentage);
end
The baseline image
The input image
I would try not doing selectStrongest and not setting MaxRatio. Just call matchFeatures with the default options and compare the number of resulting matches.
The default behavior of matchFeatures is to use the ratio test to exclude ambiguous matches. So the number of matches it returns may be a good indicator of the presence or absence of the object in the scene.
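For intuition, the default test matchFeatures applies is Lowe's nearest versus second-nearest neighbour ratio test; here is a minimal NumPy sketch of that idea (an illustration of the criterion, not the toolbox implementation):

```python
import numpy as np

def ratio_test_match(desc1, desc2, max_ratio=0.6):
    """Match each row of desc1 to its nearest row of desc2, keeping the
    match only if it is clearly better than the second-nearest candidate
    (distance ratio below max_ratio), which rejects ambiguous matches."""
    desc1 = np.asarray(desc1, dtype=float)
    desc2 = np.asarray(desc2, dtype=float)
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)
        order = np.argsort(dists)
        nearest, second = order[0], order[1]
        if dists[nearest] < max_ratio * dists[second]:
            matches.append((i, int(nearest)))  # unambiguous match
    return matches

# Toy descriptors: rows 0 and 1 of desc1 have clear partners in desc2.
d1 = [[0.0, 0.0], [5.0, 5.0]]
d2 = [[0.1, 0.0], [10.0, 10.0], [5.0, 5.1]]
```

When the object is absent, most descriptors have no clearly best partner, so the ratio test discards them and the match count drops, which is why the raw count can serve as a presence indicator.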
If you want to try something more sophisticated, take a look at this example.