Stacking Filters Weka Explorer - classification

Hi, I'm new to Weka and am using the Explorer to try to do some text classification.
I have a training set which I have tested using the StringToWordVector filter and an AttributeSelection filter. However, I want to be able to test the classifier on unseen data, so I have tried using the "Supplied test set" option. After reading around, I realise that the StringToWordVector filter has to be applied to both sets at the same time, so I have used the FilteredClassifier option to do this. However, I cannot seem to apply the AttributeSelection filter as well.
If I am going about this the wrong way, please let me know. Or if there is an option to apply or stack multiple filters when classifying, that'd be great. Cheers

You have to chain the filters (StringToWordVector and AttributeSelection) using MultiFilter; together they behave as a single filter that you can put into a FilteredClassifier. For a detailed tutorial, see Text Mining in WEKA Revisited: Selecting Attributes by Chaining Filters.
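To illustrate why the chain has to be fitted as one unit, here is a toy sketch in plain Python. It is not Weka's API: the class names WordVector, TopKAttributes and Chain are made up for this illustration and only mimic, conceptually, what StringToWordVector, AttributeSelection and MultiFilter do. The scoring rule in TopKAttributes is also an arbitrary stand-in. The point is that the vocabulary and the selected attributes are learned from the training set only and then reused unchanged on the test set.

```python
from collections import Counter


class WordVector:
    """Toy stand-in for StringToWordVector (hypothetical, not Weka code)."""

    def fit(self, docs, labels):
        # The vocabulary is learned from the training documents only.
        self.vocab = sorted({w for d in docs for w in d.lower().split()})
        return self

    def transform(self, docs):
        rows = []
        for d in docs:
            counts = Counter(d.lower().split())
            rows.append([counts[w] for w in self.vocab])
        return rows


class TopKAttributes:
    """Toy stand-in for AttributeSelection (hypothetical scoring rule)."""

    def __init__(self, k):
        self.k = k

    def fit(self, rows, labels):
        scores = []
        for c in range(len(rows[0])):
            per_label = Counter()
            for row, label in zip(rows, labels):
                per_label[label] += row[c]
            totals = [per_label[label] for label in set(labels)]
            # Score a column by how unevenly it is spread across classes.
            scores.append(max(totals) - min(totals))
        ranked = sorted(range(len(scores)), key=lambda c: (-scores[c], c))
        self.keep = sorted(ranked[: self.k])
        return self

    def transform(self, rows):
        return [[row[c] for c in self.keep] for row in rows]


class Chain:
    """Toy stand-in for MultiFilter: runs the filters in order as one filter."""

    def __init__(self, filters):
        self.filters = filters

    def fit(self, data, labels):
        for f in self.filters:
            data = f.fit(data, labels).transform(data)
        return self

    def transform(self, data):
        for f in self.filters:
            data = f.transform(data)
        return data


train_docs = ["cheap pills now", "cheap offer now", "meeting at noon", "lunch at noon"]
train_labels = ["spam", "spam", "ham", "ham"]
test_docs = ["cheap lunch offer tomorrow"]

chain = Chain([WordVector(), TopKAttributes(k=3)]).fit(train_docs, train_labels)
train_rows = chain.transform(train_docs)
test_rows = chain.transform(test_docs)  # same 3 columns as the training data
```

The unseen word "tomorrow" is simply ignored, and the test rows end up with exactly the columns chosen on the training data, which is the behaviour you need from the filter inside a FilteredClassifier.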

PyQGIS: change expression for datadefined symbology

I have a QGIS project that displays calculation results on a map. There are several vector layers, each with more than 100 calculated fields. The way the data should be visualised is very similar for all these layers and fields. I am trying to write a script that duplicates a template layer and changes the expressions for the symbology according to the selected field name.
Below is a screenshot of the properties I am trying to access (if I were to change them using the UI).
How do I access/change the expressions of the line width and line offset of a graduated symbology in PyQGIS?
In case someone else runs into this issue: I solved it using a workaround. I saved the style of the source layer to a QML file. For each duplicated layer I create a temporary copy of the QML file, do a search-and-replace on that copy, and apply it to the new layer using
newLayer.loadNamedStyle(pathToTheTempQmlFile)
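A minimal sketch of that workaround, assuming the template style has already been saved (for example with saveNamedStyle on the source layer) and that a plain text replacement of the field name is good enough. The helper name make_style_copy and the field names in the usage comment are hypothetical; only the standard library is used, so the QGIS calls are shown as comments.

```python
import tempfile
from pathlib import Path


def make_style_copy(template_qml, old_field, new_field):
    """Write a temporary copy of a QML style with old_field replaced by new_field.

    Returns the path of the temporary file. The template is left untouched.
    """
    text = Path(template_qml).read_text(encoding="utf-8")
    tmp = tempfile.NamedTemporaryFile(
        mode="w", suffix=".qml", delete=False, encoding="utf-8"
    )
    with tmp:
        # Plain string replacement: every occurrence of old_field is changed,
        # including inside longer names, so pick distinctive field names.
        tmp.write(text.replace(old_field, new_field))
    return tmp.name


# Inside the QGIS Python console (new_layer and the paths are assumed names):
# path = make_style_copy("/path/to/template.qml", "field_a", "field_b")
# new_layer.loadNamedStyle(path)
# new_layer.triggerRepaint()
```

Because the replacement is purely textual, it also changes the expressions for line width and offset without needing to know where they are stored in the renderer.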
Hope this helps (this applies when the layer uses a rule-based renderer):
rule=layer.renderer().rootRule().children()[0]
rule.setFilterExpression('whatever')
see:
QgsRuleBasedRenderer.Rule

Visio ShapeSheet ShapeData: keep two rows in sync

I have two Shape Data rows for a Shape's ShapeSheet:
Shape Data | Label | Prompt | Type | Format | Value | Invisible
Prop.Type | "Type" | "" | 4 | "Alpha;Beta;Gamma;Delta;Epsilon;Zeta;Eta;Theta;Iota;Kappa" | INDEX(4,Prop.Type.Format) | False
Prop.Abbrev | "Abbrev" | No Formula | 4 | "A;B;G;D;E;Z;E;T;I;K" | INDEX(4,Prop.Abbrev.Format) | True
The way I intend to use this is to have the user select the Type, say Epsilon, and then have the Abbrev automatically switch to the corresponding value in Prop.Abbrev.Format.
Note: the values used here are placeholders for the actual values in my application. I left those out so they don't distract from the real question: how to keep the selections in sync when the first one is chosen or changed.
Thanks for any help you can give!
I don't have Visio on this machine, so I am unable to copy and paste a working solution. The approach gets a little complicated, but it is extremely flexible.
Save your lists in the User section, rather than Prop - this then becomes underlying data for use in properties. If you are using a master stencil then this also helps with managing the fields.
You can now store an index in your data as well - this index points to the appropriate values in your arrays. You can use Actions and side menus to set the index which, when referenced properly, means you can have the full name and/or abbreviation in the side menu and the ShapeSheet does all the work underneath.
The functions you want to look at are:
Index (e.g. INDEX(1,User.Type) will return "Beta"; 0-based)
Lookup (e.g. LOOKUP("D", User.Abbrev) will return 3; 0-based)
GetAtRef
SetAtRef
SetAtRefExpr
SetF
I had a similar business problem which relied on setting a background colour based on the value of shape data. Your final solution could end up including formulas like this: =SETF(GetRef(Prop.Type),"GUARD(INDEX(LOOKUP(Prop.X,Prop.X.Format),User.Type))").
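To make the INDEX/LOOKUP idea concrete, here is a small Python emulation of the two functions applied to the lists from the question. This is only a sketch of the logic, not ShapeSheet code: the helper names are hypothetical, and the out-of-range and not-found return values ("" and -1) are assumptions about how the real functions behave.

```python
TYPES = "Alpha;Beta;Gamma;Delta;Epsilon;Zeta;Eta;Theta;Iota;Kappa"
ABBREVS = "A;B;G;D;E;Z;E;T;I;K"


def index(i, items, sep=";"):
    """Emulate INDEX: the i-th (0-based) entry of a delimited list."""
    parts = items.split(sep)
    return parts[i] if 0 <= i < len(parts) else ""


def lookup(key, items, sep=";"):
    """Emulate LOOKUP: the 0-based position of key in a delimited list."""
    parts = items.split(sep)
    return parts.index(key) if key in parts else -1


def abbrev_for(type_name):
    """Find the position of the chosen type, then read the same slot in ABBREVS."""
    return index(lookup(type_name, TYPES), ABBREVS)


print(abbrev_for("Epsilon"))
print(abbrev_for("Theta"))
```

In ShapeSheet terms the same composition would look something like INDEX(LOOKUP(Prop.Type,Prop.Type.Format),Prop.Abbrev.Format) as the value of Prop.Abbrev, which I have not been able to test here. Note that going the other way (from abbreviation to type) is ambiguous with these placeholder lists, because "E" appears twice.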
For more in-depth discussion - check out https://superuser.com/questions/1277331/fillforegnd-in-shapesheet-using-wrong-data and the extended discussion at http://visguy.com/vgforum/index.php?topic=8205.15 - the latter link also includes an example file with working shapesheets (well, working to the extent that they exposed my problem).

Only filter, no input text or number

The demo shows a feature that I cannot find in the documentation.
How can I set this up?
That's the set filter with floating filters turned on.
It's in the documentation, but maybe it's a bit hard to find:
https://www.ag-grid.com/javascript-grid-filter-set/
https://www.ag-grid.com/javascript-grid-floating-filter-component/
The set filter is an enterprise feature.
The code for that demo grid is here, CTRL+F for "countryCellRenderer" to see the relevant parts:
https://www.ag-grid.com/example.js

How can I process Persian texts using Rapid Miner?

I am working on a Persian classification project. Persian text is very similar to Arabic text. When I use Tokenize, it does not show any words on its WordList page, and on the ExampleSet page the image below is shown:
I need to classify Persian text into categories, but I don't know how.
I followed these steps:
1- Read the dataset with the Read Excel operator; it has 2 columns => col1: Persian text, col2: category
2- Use the Set Role operator to label the data
3- Use the Process Documents from Data operator, containing Tokenize (changing the mode makes no difference) and Filter Tokens (min: 5, max: 25)
4- Use the Cross Validation operator to train with SVM or Bayesian, and to get the performance in test mode.
The process runs without errors and the performance is not bad (e.g. accuracy is 50%), but I think my approach is wrong.
Any help would be appreciated.
First, make sure your text data is UTF-8 encoded.
If you use Filter Tokens (by Length), a minimum of 5 is too high; try 2, or 3 at the most.
Also, I recommend using the Filter Stopwords (Dictionary) operator; the dictionary file should have one Persian stopword per line.
Hope it helps.
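To see why the minimum length matters, here is a rough Python emulation of the three steps (tokenize, filter by length, filter stopwords). It is not RapidMiner code: the function names are hypothetical, the tokenizer just keeps runs of letters, and the three stopwords are only an example, not a real Persian stopword list.

```python
import re


def tokenize(text):
    """Split text into runs of letters (works for Persian in Python 3 strings)."""
    return re.findall(r"[^\W\d_]+", text)


def load_stopwords(path):
    """Read a UTF-8 dictionary file with one stopword per line."""
    with open(path, encoding="utf-8") as fh:
        return {line.strip() for line in fh if line.strip()}


def filter_tokens(tokens, min_len=2, max_len=25, stopwords=frozenset()):
    """Keep tokens within the length bounds that are not stopwords."""
    return [
        t for t in tokens
        if min_len <= len(t) <= max_len and t not in stopwords
    ]


text = "این یک متن فارسی است"          # "this is a Persian text"
tokens = tokenize(text)                  # 5 tokens

with_min_5 = filter_tokens(tokens, min_len=5)   # only 1 of the 5 words survives
with_min_2 = filter_tokens(tokens, min_len=2)   # all 5 words survive

example_stopwords = {"این", "یک", "است"}  # illustrative only
content_words = filter_tokens(tokens, 2, 25, example_stopwords)
```

With a minimum of 5, four of the five words in this short sentence are thrown away before the classifier ever sees them; with a minimum of 2 plus a stopword dictionary, only the two content words remain.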

I want to add a new Classification to the Coverity webapp

Right now the Coverity web api has the following options for the "Classification" field for each finding:
Unclassified
Bug
Pending
False Positive
Intentional
I'd like to add "Bad Practice" to the list. Is this possible?
No, you can't add an element to the Classification list in Coverity.
You can add to the Action and Severity lists, though, so these might meet your need.