Any reason why these instances could be misclassified? - classification

I started off with two files, training and testing.
Then, using libsvm, I scaled both of those files to training.scale and testing.scale.
Then I ran grid.py (part of libsvm) on training.scale and received the cross-validation results:
C = 512
gamma = 0.03125
5-fold cross-validation accuracy = 66.8421%
Then, running svm-train with the parameters found by grid.py on training.scale, I got a new file called training.scale.model.
I then ran svm-predict, which produced a new file called testing.predict and reported an accuracy of 60.8333%.
Finally, comparing testing and testing.predict showed that there were 47/120 misclassifications.
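For reference, the same pipeline can be driven from libsvm's bundled Python bindings; a minimal sketch, assuming training.scale and testing.scale are in libsvm format and svmutil is importable:
from svmutil import svm_read_problem, svm_train, svm_predict

# Load the scaled data sets (libsvm sparse format: label index:value ...)
y_train, x_train = svm_read_problem('training.scale')
y_test, x_test = svm_read_problem('testing.scale')

# 5-fold cross-validation with the parameters grid.py reported
cv_accuracy = svm_train(y_train, x_train, '-c 512 -g 0.03125 -v 5')

# Train the final model and evaluate it on the held-out test set
model = svm_train(y_train, x_train, '-c 512 -g 0.03125')
p_labels, p_acc, p_vals = svm_predict(y_test, x_test, model)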
Link to code: https://drive.google.com/folderview?id=0BxzgP5V6RPQHekRjZXdFYW9GX0U&usp=sharing
The real question: is there any reason why these misclassifications occur?
PS: I apologise for the bad formatting of this question; I've been up for too long.

I am guessing you are new to machine learning. The results you've got are perfectly plausible.
Why do these misclassifications occur? The features you've used don't separate your classes well. The 66% cross-validation accuracy should have given you the hint: even by plain guessing you'd get about 50% accuracy (with two balanced classes), and the feature set you used could only improve on that by another 16 points or so. Try exploring new features.
I'm assuming your data set is clean.
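One quick sanity check is to compute the majority-class baseline yourself and compare it against the cross-validation score; a minimal sketch, assuming the training file is in the usual libsvm format (label as the first token of each line):
from collections import Counter

# The first whitespace-separated token on each libsvm-format line is the label
labels = [line.split()[0] for line in open('training.scale') if line.strip()]
baseline = max(Counter(labels).values()) / len(labels)
print('majority-class baseline: %.1f%%' % (100 * baseline))  # compare with the 66.8% CV accuracy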

How to use "Easy edge trace" and "edge trace distances" in ImageJ?

I have already installed both plugins but don't know how to use them for pod analysis. I need help with that, as I don't have a programming background. Also, can they be used for batch processing of images, in case I have more than 100 images?
Another approach, via a specially coded ImageJ macro, gives reasonable estimates of the widths and lengths of all pods in the sample image. You can access the macro code from here. Unzip the archive and drop the file "plantPodDimensions.ijm" onto the ImageJ main window. Then open the sample image and run the macro. The estimated pod dimensions appear in a table.
Specimen [right to left] Mean Pod Width [cm] Pod Length [cm]
OHiI7_pod-1 0.70±0.11 23.6
OHiI7_pod-2 0.59±0.09 22.3
OHiI7_pod-3 0.64±0.05 20.7
OHiI7_pod-4 0.41±0.04 20.5
OHiI7_pod-5 0.66±0.07 22.9
OHiI7_pod-6 0.68±0.10 24.4
OHiI7_pod-7 0.60±0.07 20.5
Of course it couldn't be tested whether the macro works as expected for images other than the sample image.
The plugin downloads include documents that explain their use. Batch processing is possible if the starting points of the traces are known or can somehow be determined by additional pre-processing steps (not trivial). Both plugins are macro-recordable. In any case, batch processing will require some macro code.
For the use case in question, I would recommend performing the analyses via the GUI rather than by batch processing. Coding a suitable macro would take more time than processing the 100+ images.

EF6/Code First: Super slow during the 1st query, but only in Debug

I'm using EF6 RC1 with the Code First strategy, without pre-generated views, and the problem is:
If I compile and run the exe application, it takes about 15 seconds to run the first query (that's okay, since I'm still working on the pre-generated views). But if I use Visual Studio 2013 Preview to debug the exact same application, it takes almost 2 minutes BEFORE running the first query:
Dim Context = New MyEntities()
Dim Query = From I In Context.Itens ' <-- debugging takes 2 minutes here
Dim Item = Query.FirstOrDefault()
Is there a way to remove this extra time? Am I doing something wrong here?
PS: The context itself is not complicated; it's just large, with 200+ tables.
Edit: I found out that the problem is that, while debugging, EF appears to be generating the views, ignoring the pre-generated ones.
Using the source code from EF I discovered that the property:
IQueryProvider IQueryable.Provider
{
    get
    {
        return _provider ?? (_provider = new DbQueryProvider(
            GetInternalQueryWithCheck("IQueryable.Provider").InternalContext,
            GetInternalQueryWithCheck("IQueryable.Provider").ObjectQueryProvider));
    }
}
is where the time is being consumed. But this is strange, since it only takes that long under the debugger. Am I missing something here?
Edit: Found more info related to the question:
Edit: Found more info related to the question:
Using Process Monitor (by Sysinternals), I found out that it's the 'devenv.exe' process (Visual Studio) that is consuming tons of time. To be more specific, it's spending the time on 'Thread Exit' events; it repeats the Thread Exit stack 36 times. I don't know if this info is very useful, but I saved a '.csv' with the stack. (Edit: removed the '.csv' body; I can post it again in the comments if someone really thinks it's going to be useful, but it was confusing and too big.)
Edit: I installed VS2013 Ultimate and Entity Framework 6 RTM, installed the Entity Framework Power Tools Beta 4, and used it to generate the views. Nothing changed: if I run the exe it takes 20 seconds; if I 'Start' debugging it takes 120 seconds.
Edit: I created a small project to reproduce the error: http://sdrv.ms/16pH9Vm
Just run the project inside the IDE and directly through the .exe, click the button, and compare the loading times.
This is a known performance issue in Lazy (which EF is using) when the debugger is attached. We are currently working on a fix (the current approach we are looking at is removing the use of Lazy). We hope to ship this fix in a patch release soon. You can track progress of this issue on our CodePlex site - http://entityframework.codeplex.com/workitem/1778.
More details on the coming 6.0.2 patch release that will include a fix are here - http://blogs.msdn.com/b/adonet/archive/2013/10/31/ef6-performance-issues.aspx
I don't know if you have found the solution, but in my case I had a similar issue that wasted close to a week of trying different suggestions. Finally, I fixed it by setting optimizeCompilations="true" in my web.config, and performance improved dramatically, from 15-30 seconds to about 2 seconds.
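For anyone looking for it, that setting lives on the <compilation> element of web.config; a minimal sketch (treat the surrounding attributes as placeholders for whatever your project already has):
<configuration>
  <system.web>
    <!-- optimizeCompilations tells ASP.NET to reuse earlier compilations
         instead of recompiling everything after top-level changes -->
    <compilation debug="true" optimizeCompilations="true" />
  </system.web>
</configuration>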

How many iterations are saved by JAGS/BUGS when burnin and thinning are specified?

I have a quick question about the details of running a model in JAGS and BUGS.
Say I run a model with n.burnin=5000, n.iter=5000 and thin=2. Does this mean that the program will:
Run 5,000 iterations, and discard results; and then
Run another 10,000 iterations, only keeping every second result?
If I save these simulations as a CODA object, are all 10,000 saved, or only the thinned 5,000? I'm just trying to understand which set of iterations is used to make the ACF plot.
With JAGS, n.burnin=5000, n.iter=5000 and thin=2 means you keep nothing: you run 5000 iterations, discard the first 5000 of those 5000, and would then keep only every second value of whatever remained (keep one value, discard the next, and so on).
Use, for example, n.burnin=2000, n.iter=7000, thin=50, n.chains=5: then you keep (7000-2000)/50 * 5 = 500 values.
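The bookkeeping is easy to get wrong, so here is a tiny sketch of the arithmetic, assuming bugs()-style semantics in which n.iter includes the burn-in:
def kept_samples(n_iter, n_burnin, n_thin, n_chains=1):
    # n.iter counts the burn-in, so only the remainder is eligible for saving
    return (n_iter - n_burnin) // n_thin * n_chains

print(kept_samples(5000, 5000, 2))      # 0   -- the original settings keep nothing
print(kept_samples(7000, 2000, 50, 5))  # 500 -- the suggested settings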
Could you be more specific about which software you're talking about? It looks like you're referring to the arguments of the bugs() function in the R2WinBUGS package (except that the argument is called n.thin, not thin). Looking at help(bugs), it just says n.burnin is the "number of iterations to discard at the beginning", which doesn't specifically answer your question; but looking at the source for bugs.script() in that package suggests to me that it would run 5000 burn-in iterations, as you suspected. You could send a suggestion to the maintainers of that package to clarify their documentation.
In your example, bugs() would then run 0 further iterations after the burn-in. Here the documentation is clearer - n.iter is the total number of iterations including the burn-in.
For your second question, the CODA output from WinBUGS (and any software which calls WinBUGS or OpenBUGS) will only include the thinned sample.

Texture feature extraction using the Gray-Level Co-occurrence Matrix

I'm doing a project on liver tumor classification. I used this code and it gave some output; I don't know whether it's correct.
I initially used the region-growing method for liver segmentation, and from the segmented liver I extracted the tumor using FCM. So the input I gave to this GLCM program was the tumor-segmented image. Was that correct? If so, I think my output will also be correct.
I gave the parameters exactly as in the example. What do they actually mean? Do I need to change them for different images, and if so, how should I choose them? I'm completely new to this, so kindly guide me.
I got this output. Am I correct?
stats =
autoc: [1.857855266614132e+000 1.857955341199538e+000]
contr: [5.103143332457753e-002 5.030548650257343e-002]
corrm: [9.512661919561399e-001 9.519459060378332e-001]
corrp: [9.512661919561385e-001 9.519459060378338e-001]
cprom: [7.885631654779597e+001 7.905268525471267e+001]
cshad: [1.219440700252286e+001 1.220659371449108e+001]
dissi: [2.037387269065756e-002 1.935418927908687e-002]
energ: [8.987753042491253e-001 8.988459843719526e-001]
entro: [2.759187341212805e-001 2.743152140681436e-001]
homom: [9.930016927881388e-001 9.935307908219834e-001]
homop: [9.925660617240367e-001 9.930960070222014e-001]
maxpr: [9.474275457490587e-001 9.474466930429607e-001]
sosvh: [1.847174384255155e+000 1.846913030238459e+000]
savgh: [2.332207337361002e+000 2.332108469591401e+000]
svarh: [6.311174784234007e+000 6.314794324825067e+000]
senth: [2.663144677055123e-001 2.653725436772341e-001]
dvarh: [5.103143332457753e-002 5.030548650257344e-002]
denth: [7.573115918713391e-002 7.073380266499811e-002]
inf1h: [-8.199645492654247e-001 -8.265514568489666e-001]
inf2h: [5.643539051044213e-001 5.661543271625117e-001]
indnc: [9.980238521073823e-001 9.981394883569174e-001]
idmnc: [9.993275086521848e-001 9.993404634013308e-001]
Kindly guide me. Thank you
It's OK, but I don't think we usually need all this extra information. I usually prefer to use the following code:
GLCM2 = graycomatrix(img, 'Offset', [1 1]); % co-occurrence matrix for the diagonal neighbour at offset [1 1]
stats = graycoprops(GLCM2);                 % contrast, correlation, energy and homogeneity by default
I hope it will help you.
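If you ever need the same two-step pattern outside MATLAB, recent versions of scikit-image offer an analogous API (older releases spell the functions greycomatrix/greycoprops); a minimal sketch with a stand-in image:
import numpy as np
from skimage.feature import graycomatrix, graycoprops

img = np.random.randint(0, 256, (64, 64), dtype=np.uint8)  # stand-in for the segmented tumor ROI

# distances/angles play the role of MATLAB's 'Offset' pairs
glcm = graycomatrix(img, distances=[1], angles=[0, np.pi / 4],
                    levels=256, symmetric=True, normed=True)

for prop in ('contrast', 'dissimilarity', 'homogeneity', 'energy', 'correlation'):
    print(prop, graycoprops(glcm, prop))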

iRobot Create not returning sensor data

I am trying to stream sensor data from the iRobot Create. I get 'tuple out of range' errors when I try
bot.stream_sensors(somenumber) and bot.poll_sensors(somenumbers). Whenever I inspect bot.sensors, I just get an empty dictionary {}. I have even tried reading bot.sensors while pushing in on the bump sensor, still getting an empty dictionary. I am connected to the bot through the serial port with a serial-to-USB converter on my side. The only code before trying to get the sensor data is:
import openinterface
bot = openinterface.CreateBot(com_port="/dev/ttyUSB0", mode="full")
Does anyone have an idea of how to solve this issue? Everywhere else just uses stream_sensors(6) and it seems to work fine.
P.S. I posted a similar question on this topic not too long ago, but no one responded. I'm not trying to spam, but now I have a clearer question and know what the apparent problem is, so I thought I would try again.
I downloaded openinterface.py from this site, which included some sample programs. I'd suggest you take a step back and try the sample code, then find other, more sophisticated sample code and play with that first before moving on to your real code. You may be missing a step somewhere.
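For what it's worth, a minimal end-to-end check, using only the calls already named in the question (so treat the exact openinterface API as an assumption), would look something like this:
import time
import openinterface

bot = openinterface.CreateBot(com_port="/dev/ttyUSB0", mode="full")
bot.stream_sensors(6)   # packet group 6 is the one the other examples use
time.sleep(1)           # give the stream a moment to deliver data
print(bot.sensors)      # should now be populated instead of {}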
I may be a bit late to answer this, but for reference purposes: directly controlling the iRobot is greatly simplified by using Pyrobot.