How to check whether two results are comparable using the K-S test?

I have used two algorithms (methods) on a few datasets and obtained some results. Now I want to check whether the obtained results are comparable or not. I used the two-sample K-S test and got the following result. How should I interpret the test result, and what should the conclusion be?
KstestResult(statistic=0.11320754716981132, pvalue=0.8906908896753045)
D_alpha= 0.2641897545611759

The K-S test is used to compare distributions. So, when you apply it, you are interested in comparing a distribution, say F(X), with another distribution G(X). Typically, the former is your data distribution and the latter is either another data distribution or something you specify (e.g. a Gaussian).
Now, the null hypothesis of the test is that F(X) = G(X). The test produces a statistic to assess whether this null hypothesis can be assumed to be true. You do not typically look directly at this quantity; rather, you look at the p-value. This is the probability of observing a statistic at least as large as the one you observed, if the null hypothesis were true. A low p-value therefore means it would be very unlikely to observe such a statistic if the null hypothesis were true; so, having observed a statistic with a small p-value, you tend to conclude that the null hypothesis is false and that F(X) is indeed different from G(X).
In your case the p-value is high (by a "low" p-value we usually mean 0.1, 0.05, 0.01 or lower), so you do not reject the null hypothesis. Thus, you would say that the two sets of results are not statistically different (strictly speaking, the test finds no evidence of a difference; it does not prove the distributions are identical).
However, I strongly encourage you to read more about the test, how it is used and when, and to check whether it is appropriate in your case. The Wikipedia article and the scipy docs are good starting points.
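If it helps, here is a minimal sketch of how such a two-sample K-S test is typically run and interpreted with scipy; the arrays results_a and results_b are placeholders standing in for the outputs of your two algorithms, and the random data is only there to make the snippet self-contained.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
results_a = rng.normal(size=50)   # placeholder for algorithm A's results
results_b = rng.normal(size=50)   # placeholder for algorithm B's results

res = stats.ks_2samp(results_a, results_b)
print(res.statistic, res.pvalue)

# Compare the p-value with your chosen significance level alpha.
alpha = 0.05
if res.pvalue < alpha:
    print("Reject H0: the two result distributions appear to differ.")
else:
    print("Fail to reject H0: no evidence that the distributions differ.")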

Why "time==0.5" isn't a discrete expression in Modelica language?

I built a simple model to understand the concept of "discrete expressions"; here is the code:
model Trywhen
  parameter Real B[:] = {1.0, 2.0, 3.0};
algorithm
  when time==0.5 then
    Modelica.Utilities.Streams.print("message");
  end when;
  annotation (uses(Modelica(version="3.2.3")));
end Trywhen;
But when checking the model, I got an error saying that "time==0.5" isn't a discrete expression.
If I change time==0.5 to time>=0.5, the model passes the check.
And if I use an if-clause instead of a when-clause, the model works fine, but with a warning saying that "Variables of type Real cannot be compared for equality."
My questions are:
Why is time==0.5 NOT a discrete expression?
Why can't variables of type Real be compared for equality? Comparing two variables of type Real seems like a common thing to do.
The first question is not important, since time==0.5 is not allowed.
The second question is the important one:
Comparing reals for equality is common in other languages, and also a common source of errors - unless special care is taken.
Merely using the floating point compare of the processor is a really bad idea on some processors (like Intel) that mix 80-bit and 64-bit floating point numbers (or it comes with a performance penalty), and in other cases as well it may not work as intended. In this case 0.5 can be represented exactly as a floating point number, but 0.1 and 0.2 cannot.
Often abs(x-y)<eps is a good alternative, but it depends on the intended use and the eps depends on additional factors; not only machine precision but also which algorithm is used to compute x and y and its error propagation.
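As a quick illustration of this (written in Python only for brevity; the underlying floating-point behaviour is the same regardless of language):

x = 0.1 + 0.2
print(x == 0.3)              # False: 0.1, 0.2 and 0.3 have no exact binary representation
print(abs(x - 0.3) < 1e-12)  # True: a tolerance-based comparison behaves as intended
print(0.25 + 0.25 == 0.5)    # True: these values are exactly representable in binary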
In Modelica the problems are worse than in many other languages, since tools are allowed to optimize expressions a lot more (including symbolic manipulations) - which makes it even harder to figure out a good value for eps.
All those problems mean that it was decided to not allow comparison for equality - and require something more appropriate.
In particular if you know that you will only approach equality from one direction you can avoid many of the problems. In this case time is increasing, so if it has been >0.5 at an event it will not be <=0.5 at a later event, and when will only trigger the first time the expression becomes true.
Therefore when time>=0.5 will only trigger once, and will trigger about when time==0.5, so it is a good alternative. However, there might be some numerical inaccuracies and thus it might trigger at 0.500000000000001.

Is MATLAB's mnrfit incorrect?

It seems Matlab is giving incorrect results for multinomial logistic regression.
In their example documentation using Fisher's Iris dataset [link], they give coefficients for the model which can be used on the same data set itself to get the modeled probabilities.
load fisheriris
sp = categorical(species);
[B,dev,stats] = mnrfit(meas,sp);
PHAT=mnrval(B,meas);
However, none of the expected value aggregates match the population aggregates, which is a requirement for a MaxEnt classifier (see slide 35 [here], or Eq. 14 [here], or Agresti, "Categorical Data Analysis", p. 298, etc.).
For example
>> sum(PHAT)
49.9828   49.8715   50.1456
These should all equal 50 (the population values); likewise for other aggregations.
If the parameters
B=[36.9450 42.6378
12.2641 2.4653
14.4401 6.6809
-30.5885 -9.4294
-39.3232 -18.2862]
were used instead then all aggregated sufficient statistics match.
Additionally, it seems odd that MATLAB is solving it by maximizing the likelihood iteratively, which can produce a warning:
Warning: Maximum likelihood estimation did not converge. Iteration
limit exceeded. You may need to merge categories to increase observed
counts
whereas the only requirement, which follows from the MLE conditions, is that the expected values match the observed aggregates; no likelihood evaluation is needed.
It would be a nice feature if, instead of the true classes, we could supply just the aggregate information as an option.
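For anyone who wants to sanity-check the "aggregates must match" property outside MATLAB, here is a hedged sketch in Python using scikit-learn's unregularized multinomial logit on the same iris data; it only illustrates the property at the MLE and is not a reproduction of mnrfit.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# penalty=None (spelled 'none' in older scikit-learn versions) asks for a plain
# maximum-likelihood fit; because the classes are (nearly) separable the solver
# may warn about convergence, much like mnrfit does.
clf = LogisticRegression(penalty=None, max_iter=10000)
clf.fit(X, y)

# At the MLE the expected class totals must match the observed totals, so each
# column sum of the fitted probabilities should be (very close to) 50.
phat = clf.predict_proba(X)
print(phat.sum(axis=0))   # expect values very close to [50. 50. 50.]
print(np.bincount(y))     # observed class counts: [50 50 50]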
I submitted a technical error report on the MathWorks website. Their reply:
Hello [----],
I am writing in reference to your Technical Support Case #01820504
regarding 'mnrfit'.
Thanks a lot for your patience and reporting this issue. This appears
to be unexpected behavior. It appears to be related to an existing
issue we have in our records, that "mnrfit" does not give correct
maximum likelihood estimates in certain cases. Since the "mnrfit"
function is not finding the maximum likelihood estimates for the
coefficients, we calculated the actual MLEs. When we use these
estimates, we get the desired result of all 50s in this case.
The issue is that, for this particular dataset in our example, the
classes can be separated perfectly. This means that the logistic
function, in order to get exact zero or one probabilities, needs to
have infinite coefficients. The "mnrfit" function carries out an
iterative procedure with the coefficients getting larger, but it stops
at a point where the results have the issue that you have found. We
certainly agree that "mnrfit" could be made to do better. Our
development team is working on it.
At this stage, I am not able to suggest a workaround other than to
write a custom implementation as my colleague and I had tried. For
now, I will be closing this request as I have already forwarded it to
our records. However, if you have any additional questions related to
this case, please do not hesitate to reach me.
Sincerely,
[----]
MathWorks Technical Support Department

Negative option prices for certain input values in MATLAB?

In the course of testing an algorithm I computed option prices for random input values using the standard pricing function blsprice implemented in MATLAB's Financial Toolbox.
Surprisingly (at least for me), the function seems to return negative option prices for certain combinations of input values.
As an example take the following:
>> [Call,Put]=blsprice(67.6201,170.3190,0.0129,0.80,0.1277)
Call = -7.2942e-15
Put = 100.9502
If I change the time to expiration to 0.79 or 0.81, the value becomes non-negative, as I would expect.
Has anyone ever experienced something similar, and can you come up with a short explanation of why this happens?
I don't know which version of the Financial Toolbox you are using but for me (TB 2007b) it works fine.
When running:
[Call,Put]=blsprice(67.6201,170.3190,0.0129,0.80,0.1277)
I get the following:
Call = 9.3930e-016
Put = 100.9502
which is indeed positive.
A bit late, but I have come across things like this before. The small negative value can be attributed to numerical rounding and/or truncation error within the routine used to compute the cumulative normal distribution.
As you know, computers are not perfect and small numerical errors persist in all calculations. In my view, the question one should ask instead is: what is the accuracy of the input parameters being used, and therefore what error tolerance is acceptable for the outputs?
The way I thought about it when I encountered this before: in finance, typical annual stock price return volatility is of the order of 30%, which means the mean returns are typically sampled with a standard error of roughly 30% / sqrt(N), which is of the order of +/- 1% assuming two years' worth of data (so N = 260 x 2 = 520; with any more data you hit the other problem of the stationarity assumption). Therefore, on that basis, the answer you got above could be interpreted as zero given the error tolerance.
Also, we typically work to penny/cent accuracy, and again on that basis the answer you had could be interpreted as zero.
Just thought I'd give my 2c; hope this is helpful in some way if you are still checking for answers!
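To make the rounding-error explanation concrete, here is a hedged Python re-implementation of the Black-Scholes formula (not MATLAB's blsprice) with the same inputs; it shows that the call price is the difference of two tiny, nearly equal terms, so its sign is at the mercy of rounding inside the normal CDF.

from math import exp, log, sqrt
from scipy.stats import norm

S, K, r, T, sigma = 67.6201, 170.3190, 0.0129, 0.80, 0.1277

d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
d2 = d1 - sigma * sqrt(T)

call = S * norm.cdf(d1) - K * exp(-r * T) * norm.cdf(d2)
put = K * exp(-r * T) * norm.cdf(-d2) - S * norm.cdf(-d1)

# d1 and d2 are around -8 here, so N(d1) and N(d2) are both of order 1e-15 and
# the call is the difference of two nearly equal tiny numbers; the sign of the
# residual depends on the CDF implementation and the platform.
print(call, put)        # call is tiny (around 1e-15 or so), put is about 100.95
print(max(call, 0.0))   # clamping tiny negatives to zero is a common safeguard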

Algorithm generation

I have a rather large (not too large, but possibly 50+) set of conditions that must be placed on a set of data (or rather, the data should be manipulated to fit the conditions).
For example, suppose I have a sequence of binary numbers of length n;
if n = 5 then an element in the data might be {0,1,1,0,0} or {0,0,0,1,1}, etc.
BUT there might be a set of conditions such as
x_3 + x_4 = 2
sum(x_even) <= 2
x_2*x_3 = x_4 mod 2
etc...
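(As a purely illustrative reading of those example conditions, assuming 1-based indices x_1..x_5, such a checker might look like the following in Python; the indexing convention is my assumption.)

def satisfies(x):
    # x is a binary sequence [x1, x2, x3, x4, x5]
    x1, x2, x3, x4, x5 = x
    return (x3 + x4 == 2                  # x_3 + x_4 = 2
            and x2 + x4 <= 2              # sum(x_even) <= 2
            and (x2 * x3) % 2 == x4 % 2)  # x_2*x_3 = x_4 mod 2

print(satisfies([0, 1, 1, 1, 0]))  # True: all three conditions hold
print(satisfies([0, 0, 0, 1, 1]))  # False: x_3 + x_4 = 1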
Because the conditions are quite complex, in that they come from experiment (although they can be written down in logical form) and are hard to diagnose, I would like instead to use a large sample set of valid data, i.e., data I know satisfies the conditions, and quite a lot of it; it is easier to collect the data than it is to deduce the conditions that the data must abide by.
Having said that, basically what I'm doing is very similar to neural networks. The difference is, I would like an actual algorithm, in some sense optimal, in some form of code that I can run instead of the network.
It might not be clear what I'm actually trying to do. What I have is a set of data in some raw format that is unique and unambiguous but not appropriate for my needs (in a sense, the amount of data is too large).
I need to map the data into another set that actually is ambiguous to some degree but also has a certain specific set of constraints that all the data follows (certain things just cannot happen while others are preferred).
The unique constraints and preferences are hard to figure out. That is, the mapping from the non-ambiguous set to the ambiguous set is hard to describe (which is why it is ambiguous). The goal, actually, is to end up with an unambiguous map by supplying the right constraints, if at all possible.
So, in the vein of my initial example, I'm given (or supply) a set of elements and need some way to derive a list of constraints similar to what I've listed.
In a sense, I simply have a set of valid data and train on it, very much like a neural network.
Then, after this "training", I'm given a mapping function I can use on any element in my dataset, and it will produce a new element satisfying the constraints, or, if it can't, will give as close to an unambiguous result as possible.
The main difference between neural networks and what I'm trying to achieve is that I'd like to end up with an algorithm, in code, that can be used instead of having to run a neural network. Such an algorithm would probably be a lot less complex, would not need potential retraining, and would be a lot faster.
Here is a simple example.
Suppose my "training set" are the binary sequences and mappings
01000 => 10000
00001 => 00010
01010 => 10100
00111 => 01110
then from the "Magical Algorithm Finder"(tm) I would get a mapping out like
f(x) = x rol 1 (rol = rotate left)
or whatever way one would want to express it.
Then I could simply apply f to any other element, such as x = 011100, to generate a hopefully unambiguous output.
Of course there are many such functions that will work on this example, but the goal is to supply enough of the dataset to narrow it down to hopefully a few functions that make the most sense (at the very least, functions that will always map the training set correctly).
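As a toy illustration of what such a "finder" could do on this particular example, here is a hedged Python sketch that simply enumerates a tiny hypothesis space (left rotations) and keeps whatever is consistent with every training pair; it is only meant to make the idea concrete, not to be the actual learning method.

training = {
    "01000": "10000",
    "00001": "00010",
    "01010": "10100",
    "00111": "01110",
}

def rol(bits, k):
    # Rotate the bit string left by k positions.
    k %= len(bits)
    return bits[k:] + bits[:k]

# Candidate hypotheses: "rotate left by k" for each possible k.
n = len(next(iter(training)))
consistent = [k for k in range(n)
              if all(rol(x, k) == y for x, y in training.items())]
print(consistent)  # [1]: "rotate left by 1" is the only rotation fitting all pairs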
In my specific case I could easily convert my problem into mapping the set of binary digits of length m to the set of base-B digits of length n. The constraints prevent some numbers from having an inverse; i.e., the mapping is injective but not surjective.
My algorithm could be a simple collection of if-statements acting on the digits, if need be.
I think what you are looking for here is an application of Learning Classifier Systems (LCS; see the Wikipedia article). There are actually quite a few open-source LCS implementations available, but you may need to experiment with the parameters in order to get a good result.
LCS/XCS/ZCS have the features you are looking for, including individual rules that can be heavily optimized, pressure to reduce the rule set, and of course a human-readable/understandable set of rules (unlike a neural net).

How to decide if a user's anonymous operator is linear

If I'm getting an anonymous operator from a user, I would like to test (extremely quickly) whether the operator is linear. Is there a standard way to do this? I have no way of doing symbolic manipulation or parsing the function. Is the only way to try some random functions (and which random functions should I choose?) and see whether they satisfy linearity?
Thanks in advance.
Context:
User supplies a black box operator, that is a function which takes functions to functions.
I can give the operator a function and I get a function back. I want to determine whether the operator is linear. Is there a standard, fast method which gives me high confidence?
No, not without sweeping the entire parameter space. Imagine the following:
#(f) #(x) f(x) + (x == 1e6)
This operator is non-linear, but there's no way of knowing that unless you happen to test at x == 1e6.
UPDATE
As others have suggested, it may be "good enough" to determine a domain of interest, and check for linearity at periodic intervals across the domain. This may give you false positives (i.e. suggest an operator is linear when in fact it's non-linear), but never false negatives.
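A hedged Python sketch of that periodic-probing idea (the probe functions, coefficients and tolerance are all illustrative assumptions, and the black-box operator is the counterexample from above):

import math

def probably_linear(L, xs, tol=1e-9):
    # Heuristic check: return False if a violation of
    # L(a*f + b*g) == a*L(f) + b*L(g) is found at the probe points xs,
    # True otherwise (evidence of linearity, not a proof).
    f, g = math.sin, math.cos  # bounded probe functions (arbitrary choice)
    for a, b in [(1.0, 1.0), (2.0, -3.0), (0.5, 4.0)]:
        h = lambda x: a * f(x) + b * g(x)
        for x in xs:
            lhs = L(h)(x)
            rhs = a * L(f)(x) + b * L(g)(x)
            if abs(lhs - rhs) > tol * (1.0 + abs(rhs)):
                return False
    return True

# The counterexample from the answer: linear everywhere except at x == 1e6.
L = lambda func: (lambda x: func(x) + (x == 1e6))
print(probably_linear(L, xs=[0.0, 1.0, 2.5]))  # True: the probe points miss x == 1e6
print(probably_linear(L, xs=[1e6]))            # False: the hidden non-linearity is hit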
This is information the user should supply. Add a linear true/false parameter, defaulting to false (I'm assuming that the code for the non-linear case will work for linear operators too, just taking more time).
The problem with random testing is that sooner or later you will classify a non-linear operator as linear, and then the user has a problem, because your function unpredictably produces wrong results (depending on which points you pick randomly) that may be reasonably close to the correct results, i.e. people may not notice for a long time. This is a recipe for disaster.
Really, the user should know this in the first place. It is very important to avoid false positives, and as said before, there is no completely reliable way to test this. Save yourself the trouble and add an additional parameter.