I'm studying hash functions with Wikipedia, and here is its explanation of testing and measuring a hash function's uniformity.
here is the link
The chi-square test is mentioned in the description, but I do not understand how this formula relates to the chi-square test or how it was derived.
What is the relation between the formula and the chi-square test?
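For what it's worth, here is the connection as I understand it. I am reconstructing the article's formula from memory, so treat its exact form as an assumption: with $n$ keys hashed into $m$ buckets and $b_j$ keys landing in bucket $j$, the quoted statistic is $\sum_j b_j(b_j+1)/2$ divided by $\frac{n}{2m}(n+2m-1)$.

```latex
% Pearson's chi-square statistic for uniform expected counts n/m:
\[
  \chi^2 \;=\; \sum_{j=0}^{m-1} \frac{(b_j - n/m)^2}{n/m}
        \;=\; \frac{m}{n}\sum_{j=0}^{m-1} b_j^2 \;-\; n,
\]
% while the article's numerator is an affine function of the same sum of squares:
\[
  \sum_{j=0}^{m-1} \frac{b_j(b_j+1)}{2}
        \;=\; \frac{1}{2}\Bigl(\sum_{j=0}^{m-1} b_j^2 + n\Bigr).
\]
% So the numerator is a shifted, rescaled chi-square statistic. Under
% uniform (multinomial) hashing its expected value is (n/2m)(n + 2m - 1),
% which is exactly the denominator, so the whole ratio has expectation
% approximately 1 for a uniform hash function.
```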
I have a question regarding the results we obtain from ODE solvers. I will try my best to explain it briefly. For example, if we ran a simulation with ANSYS, or maybe any other FEA package, there are many parameters for checking the quality of the final results before we draw conclusions from them.
But in a numerical simulation, we are the ones who give relTol, absTol, and other parameter values to the solver to improve the accuracy of the calculation. For example, suppose we select solve_ivp, the highly customisable solver available with SciPy.
Q1) How exactly do we make sure the results of the solver are acceptable?
Q2) In what ways can we check the quality of the final results we obtained, before we draw a conclusion based on them?
Q3) How can we further improve the accuracy of the solution by changing solver options?
I would highly appreciate it if you could share your ideas with sample code.
IMO, Q1 and Q2 are the same question. The reliability of the results will depend on the accuracy of the mathematical model with respect to the simulated phenomenon (for instance, assuming linearity when linearity is questionable) and on the precision of the algorithm. You need to check that the method converges and, if it does, that it converges to a correct solution.
Ideally, you should compare your results to "ground truth" on typical problems. Ground truth can be obtained from a lab experiment, or by using an alternative method known to yield correct results. Without this, you will never be sure that your numerical method is valid, other than by an act of faith.
To understand the effect of the parameters and address Q3, you can solve the same problem with different parameter settings and observe their effects one by one, as in the sketch below. After a while, you should get a better understanding of the convergence properties in relation to the parameter settings.
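Since sample code was requested: here is a minimal Python sketch of that tolerance-sweep idea, using SciPy's solve_ivp on a toy problem whose exact solution is known (the test problem, tolerances, and method choices are mine, not from the question):

```python
import numpy as np
from scipy.integrate import solve_ivp

def rhs(t, y):
    # Toy test problem: y' = -2y, with exact solution y(t) = y0 * exp(-2t).
    return -2.0 * y

t_span, y0 = (0.0, 5.0), [1.0]
exact = np.exp(-2.0 * t_span[1])  # exact value of y at t = 5

# Tighten rtol/atol step by step and watch the error at the final time.
for tol in (1e-3, 1e-6, 1e-9):
    sol = solve_ivp(rhs, t_span, y0, method="RK45", rtol=tol, atol=tol)
    err = abs(sol.y[0, -1] - exact)
    print(f"rtol = atol = {tol:.0e}  ->  final-time error = {err:.2e}")

# If the error decreases steadily as the tolerances tighten, the solver is
# converging. When no exact solution is available, re-solving with a very
# different method (e.g. the implicit 'Radau') and comparing the answers
# is a cheap consistency check.
```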
I am looking for algorithms that assign weights to some variables based on an outcome. You have a response variable Y, say the sales generated by a customer, and some explanatory variables related to each customer. I want to attribute a value/weight to each of these variables.
I started by fitting a linear regression, but the R-squared was not so attractive. Are there any suggestions for other models that do the same thing, but maybe with more precision?
For example, for linear regression, the beta values are the weights I am looking for.
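As a concrete version of that setup, here is a small scikit-learn sketch (the synthetic data and "true" weights are invented purely for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))            # explanatory variables per customer
true_betas = np.array([3.0, -1.0, 0.5])  # invented ground-truth weights
y = X @ true_betas + rng.normal(scale=0.5, size=200)  # sales, plus noise

model = LinearRegression().fit(X, y)
print(model.coef_)        # the fitted betas: one weight per variable
print(model.score(X, y))  # R-squared, the fit quality in question
```

If the linear fit stays poor, tree ensembles give an alternative notion of per-variable weight, e.g. RandomForestRegressor's feature_importances_.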
I'm having a bit of trouble following the explanation of the parameters for vgxset. Being new to the field of time series is probably part of my problem.
The vgxset help page (http://www.mathworks.com/help/econ/vgxset.html) says that it's for a generalized model structure, VARMAX, and I assume that I just use a portion of that for VARMA. I basically tried to figure out which parameters pertain to VARMA, as opposed to the additional parameters for VARMAX. I assumed (maybe wrongly) that nX and b pertain to the exogenous variables. Unfortunately, I haven't found much on the internet about the prevailing notational conventions for a VARMAX model, so it's hard to be sure.
The SAS page for VARMAX (http://support.sas.com/documentation/cdl/en/etsug/67525/HTML/default/viewer.htm#etsug_varmax_details02.htm) shows that if you have "r" exogenous inputs and k time series, and if you look back at "s" time steps' worth of exogenous inputs, then you need "s" matrices of coefficients, each k-by-r in size.
This doesn't seem to be consistent with the vgxset page, which simply provides an nX-vector "b" of regression parameters. So my assumption that nX and b pertain to the exogenous inputs seems wrong, yet I'm not sure what else they can refer to in a VARMAX model. Furthermore, in all 3 examples given, nX seems to be set to the 3rd argument "s" in VARMAX(p,q,s). Again, though, it's not entirely clear because in all the examples, p=s=2.
Would someone be so kind as to shed some light on VARMAX parameters "b" and "nX"?
On Saturday, May 16, 2015 at 6:09:20 AM UTC-4, Rick wrote:
Your assessment is generally correct: the "nX" and "b" parameters do indeed correspond to the exogenous input data x(t). The number of columns (i.e., time series) in x(t) is "nX" and is what SAS calls "r", and the coefficient vector "b" holds its regression coefficients.

I think the distinction here, and perhaps your confusion, is that SAS incorporates exogenous data x(t) as what's generally called a "distributed lag structure", in which they specify an r-by-T predictor time series and allow this entire series to be lagged using lag operator polynomial notation, as are the AR and MA components of the model.

MATLAB's Econometrics Toolbox adopts a more classical regression component approach. Any exogenous data is included as a simple regression component and is not associated with a lag operator polynomial.

In this convention, if the user wants to include lags of x(t), then they would simply create the appropriate lags of x(t) and include them as additional series (i.e., additional columns of a larger multivariate exogenous/predictor matrix, say X(t)). See the utility function LAGMATRIX.

Note that both conventions are perfectly correct. Personally, I feel that the regression component approach is slightly more flexible, since it does not require you to include "s" lags of all series in x(t).
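LAGMATRIX is the MATLAB utility mentioned in that reply; purely to illustrate the regression-component convention it describes, here is a small Python/NumPy sketch of the same idea (the function name and toy data are my own):

```python
import numpy as np

def lagmatrix(x, lags):
    """Columns are x shifted down by each lag, NaN-padded at the top
    (mimicking MATLAB's LAGMATRIX)."""
    x = np.asarray(x, dtype=float)
    out = np.full((len(x), len(lags)), np.nan)
    for col, lag in enumerate(lags):
        out[lag:, col] = x[: len(x) - lag]
    return out

x = np.arange(1.0, 8.0)        # a toy exogenous series x(t)
X = lagmatrix(x, [0, 1, 2])    # columns: x(t), x(t-1), x(t-2)
print(X)
# Each column enters the model as an ordinary regressor, so here
# nX = 3 and b would hold one coefficient per column.
```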
Interesting. I'm still wrapping my brain around the use of regression to determine lag coefficients. It turns out that the multitude of online tutorials and hard-copy library texts I've looked at haven't really given much explanatory transition between the theoretical projection of new values onto past values and actual regression using sample data. Your description is making this more concrete. Thank you.
AFTERNOTE: In keeping with the best practice of which I've been advised, I am posting links to the fora in which I posed this question:
http://www.mathworks.com/matlabcentral/newsreader/view_thread/341064
Matlab's VARMAX regression parameters/coefficients nX & b
https://stats.stackexchange.com/questions/152578/matlabs-varmax-regression-parameters-coefficients-nx-b
I need to calculate the log-likelihood of a linear regression model in MATLAB (I don't have the newer mle function, unfortunately).
I realize that the parameters are the same as in ordinary least squares (at least asymptotically), but it's the actual log-likelihood value that I need.
Although the theoretical result is well known and given in several sources, I'd like to find a pre-existing implementation so that I can be confident that it's tried and tested.
The problem is that I have no numerical examples against which I can validate my own implementation.
Failing that, if someone can point me to a numerical example of a log-likelihood calculation for a linear regression model with N(0, sigma^2 I) errors, that would be great too. It's easy enough to program, but I can't really trust it unless it's been tested.
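Not a MATLAB implementation, but as a way to generate the missing numerical reference: here is a short Python/NumPy sketch of the textbook Gaussian log-likelihood, evaluated at the ML estimates (the formula is standard; the helper name and test data are mine):

```python
import numpy as np

def linreg_loglik(X, y):
    """Log-likelihood of y = X b + e, e ~ N(0, sigma^2 I), at the MLE.

    X is the n-by-k design matrix (include a column of ones for an
    intercept); y is the n-vector of responses.
    """
    n = len(y)
    b, *_ = np.linalg.lstsq(X, y, rcond=None)  # OLS = ML estimate of b
    resid = y - X @ b
    sigma2 = resid @ resid / n                 # ML variance (divide by n, not n - k)
    # Concentrated log-likelihood: -n/2 * (log(2*pi) + log(sigma2) + 1)
    return -0.5 * n * (np.log(2.0 * np.pi) + np.log(sigma2) + 1.0)

# Toy check: an OLS reference such as statsmodels' OLS(...).fit().llf
# should reproduce this value on the same data.
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(50), rng.normal(size=50)])
y = X @ np.array([2.0, 0.5]) + rng.normal(scale=0.3, size=50)
print(linreg_loglik(X, y))
```

The same few lines port directly to MATLAB (backslash for the OLS solve), so you can cross-check your own code against the value this produces.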
This question could refer to any computer algebra system which has the ability to compute a Gröbner basis from a set of polynomials (Mathematica, Singular, GAP, Macaulay2, MATLAB, etc.).
I am working with an overdetermined system of polynomials for which the full Gröbner basis is too difficult to compute. However, it would be valuable for me to be able to print out the Gröbner basis elements as they are found, so that I may know whether a particular polynomial is in the Gröbner basis. Is there any way to do this?
If you implement Buchberger's algorithm on your own, then you can simply print out the elements as they are found; see the sketch after this answer.
If you have Mathematica, you can use this code as your starting point.
https://www.msu.edu/course/mth/496/snapshot.afs/groebner.m
See the function BuchbergerSteps.
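If you would rather roll your own, here is a naive Buchberger sketch in Python with SymPy that prints each new element as it is found. The example polynomials and monomial order are my own choices, and, as the next answer points out, the intermediate elements are not guaranteed to form a Gröbner basis until the algorithm finishes.

```python
from itertools import combinations
from sympy import symbols, expand, lcm, LT, reduced

def buchberger_verbose(F, gens, order="grevlex"):
    """Naive Buchberger's algorithm: print each new basis element as found.

    No pair-selection or elimination criteria, so this is slow; it is
    meant only to illustrate where the printing hook goes.
    """
    G = [expand(f) for f in F]
    pairs = list(combinations(range(len(G)), 2))
    while pairs:
        i, j = pairs.pop(0)
        lt_i = LT(G[i], *gens, order=order)
        lt_j = LT(G[j], *gens, order=order)
        m = lcm(lt_i, lt_j)
        s = expand(m / lt_i * G[i] - m / lt_j * G[j])  # S-polynomial
        _, r = reduced(s, G, *gens, order=order)       # remainder mod current G
        if r != 0:
            # The new element will sit at index len(G); pair it with all others.
            pairs.extend((k, len(G)) for k in range(len(G)))
            G.append(expand(r))
            print("new basis element:", r)
    return G

x, y = symbols("x y")
buchberger_verbose([x**2 + y, x*y - 1], (x, y))
```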
Due to the way the Buchberger algorithm works (see, for instance, Wikipedia or IVA), the partial results that you could obtain by printing intermediate results are not guaranteed to constitute a Gröbner basis.
Depending on your ultimate goal, you may want to try instead an algorithm for triangularization of ideals, such as Ritt-Wu's algorithm (see IVA or Shang-Ching Chou's book). This is somewhat similar to reduction to row echelon form in Linear Algebra, and you may interrupt the algorithm at any point to get a partially reduced system of polynomial equations.