Elisp: What is the time complexity for basic arithmetic operations using calc functions - lisp

This includes addition, subtraction, multiplication, and division.
I'm asked to analyze some algorithms that rely heavily on calling calc-eval to work. My teacher does want us to account for the complexity of basic operations when working with large numbers.
How do these arithmetic operations scale as the size of the numbers increases?

Related

Neural Networks w/ Fixed Point Parameters

Most neural networks are trained with floating point weights/biases.
Quantization methods exist to convert the weights from float to int, for deployment on smaller platforms.
Can you build neural networks from the ground up that constrain all parameters, and their updates, to integer arithmetic?
Could such networks achieve a good accuracy?
(I know a bit about fixed-point and have only some rusty NN experience from the 90's so take what I have to say with a pinch of salt!)
The general answer is yes, but it depends on a number of factors.
Bear in mind that floating-point arithmetic is basically the combination of an integer significand with an integer exponent so it's all integer under the hood. The real question is: can you do it efficiently without floats?
Firstly, "good accuracy" is highly dependent on many factors. It's perfectly possible to perform integer operations that have higher granularity than floating-point. For example, 32-bit integers have 31 bits of mantissa while 32-bit floats effectively have only 24. So provided you do not require the added precision that floats give you near zero, it's all about the types that you choose. 16-bit -- or even 8-bit -- values might suffice for much of the processing.
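As a rough illustration of the granularity point, here is a small Python sketch. It uses the standard struct module to round a value to IEEE 754 single precision; the helper name to_float32 is made up for this example.

```python
import struct

def to_float32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

n = 2**24 + 1                      # fits easily in a 32-bit integer
as_float = to_float32(float(n))    # needs 25 significant bits, float32 has 24

print(n)               # the integer is stored exactly
print(int(as_float))   # the float32 value has been rounded to a neighbour
```

Above 2^24, a 32-bit float can no longer represent every integer, while a 32-bit integer still can.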
Secondly, accumulating the inputs to a neuron has the issue that unless you know the maximum number of inputs to a node, you cannot be sure what the upper bound is on the values being accumulated. So effectively you must specify this limit at compile time.
Thirdly, the most complicated operation during the execution of a trained network is often the activation function. Again, you first have to think about the range of values within which you will be operating. You then need to implement the function without the aid of an FPU with all of the advanced mathematical functions it provides. One way to consider doing this is via lookup tables.
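A minimal sketch of the lookup-table idea in Python. The Q8 format, the input range of [-8, 8) and the table spacing are all assumptions picked for illustration. The table is generated once, offline, where floating point is available; the run-time function uses only integer operations.

```python
import math

FRAC_BITS = 8                      # Q8 fixed point: real value = raw / 256
ONE = 1 << FRAC_BITS
X_MIN, X_MAX = -8 * ONE, 8 * ONE   # inputs outside [-8, 8) are clamped
STEP_SHIFT = 4                     # one table entry per 16 raw units

# Built once, offline, with floating point.
SIGMOID_TABLE = [
    round(ONE / (1.0 + math.exp(-(raw / ONE))))
    for raw in range(X_MIN, X_MAX, 1 << STEP_SHIFT)
]

def sigmoid_q8(raw_x):
    """Integer-only sigmoid: clamp the input, shift to get an index, look up."""
    raw_x = max(X_MIN, min(X_MAX - 1, raw_x))
    return SIGMOID_TABLE[(raw_x - X_MIN) >> STEP_SHIFT]

print(sigmoid_q8(0))        # sigmoid(0) = 0.5, i.e. 128 in Q8
```

A coarser or finer table trades memory against accuracy; interpolating between entries is another option if the table has to stay small.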
Finally, training involves measuring the error between values and that error can often be quite small. Here is where accuracy is a concern. If the differences you are measuring are too low, they will round down to zero and this may prevent progress. One solution is to increase the resolution of the value by providing more fractional digits.
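To make the rounding-to-zero problem concrete, here is a small Python sketch. The helper names, the learning rate and the error value are all made up for illustration; the only point is how the number of fractional bits changes the result.

```python
def to_fixed(x, frac_bits):
    """Convert a real number to a fixed-point integer with frac_bits fractional bits."""
    return round(x * (1 << frac_bits))

def fixed_mul(a, b, frac_bits):
    """Multiply two fixed-point values and rescale the product back to frac_bits."""
    return (a * b) >> frac_bits

learning_rate, error = 0.01, 0.02   # illustrative values only

for frac_bits in (8, 16):
    update = fixed_mul(to_fixed(learning_rate, frac_bits),
                       to_fixed(error, frac_bits), frac_bits)
    # With 8 fractional bits the update vanishes; with 16 it survives.
    print(frac_bits, update)
```

With 8 fractional bits the product is smaller than one unit in the last place and is truncated to zero, so the weight would never move.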
One advantage that integers have over floating-point here is their even distribution. Where floating-point numbers lose accuracy as they increase in magnitude, integers maintain a constant precision. This means that if you are trying to measure very small differences in values that are close to 1, you should have no more trouble than you would if those values were as close to 0. The same is not true for floats.
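The difference in spacing can be checked directly. This sketch assumes Python 3.9 or later (for math.ulp) and an arbitrary Q16 fixed-point format for comparison.

```python
import math

FRAC_BITS = 16
fixed_step = 1 / (1 << FRAC_BITS)   # the same step everywhere in a Q16 format

for x in (2.0**-20, 1.0, 2.0**20):
    # math.ulp(x) is the gap between x and the next larger double.
    print(x, math.ulp(x), fixed_step)
```

The gap between adjacent doubles grows with the magnitude of x, whereas the fixed-point step is the same at every magnitude.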
It's possible to train a network with higher precision types than those used to run the network if training time is not the bottleneck. You might even be able to train the network using floating-point types and run it using lower-precision integers but you need to be aware of differences in behavior that these shortcuts will bring.
In short, the problems involved are by no means insurmountable, but you need to take on some of the mental effort that would normally be saved by using floating-point. However, especially if your hardware is physically constrained, this can be a hugely beneficial approach, as floating-point arithmetic requires as much as 100 times more silicon and power than integer arithmetic.
Hope that helps.

Calculation of hash of a string (MD5, SHA) as a basis for CPU benchmarking

I know that there are many applications and tools available for benching the computational power of CPUs especially in terms of floating point and integer calculations.
What I want to know is how good hashing functions such as MD5, SHA, ... are for benchmarking CPUs. Do these functions include enough floating point and integer calculations that applying a series of them could be a good basis for CPU benchmarking?
In case platform matters, I'm concerned with Windows and .Net.
MD5 and SHA hash functions do not use floating point at all. They are implemented entirely with discrete (integer and bitwise) math.
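To give a feel for the kind of work involved, here is a Python sketch of a few building blocks: a 32-bit rotate, addition modulo 2^32, and the small sigma-0 function from the SHA-256 message schedule. The function names are my own, and this is only a fragment, not a working hash.

```python
MASK32 = 0xFFFFFFFF

def rotr32(x, n):
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK32

def add32(a, b):
    """Addition modulo 2**32."""
    return (a + b) & MASK32

def small_sigma0(x):
    """SHA-256 message-schedule function: two rotates, one shift, XORed together."""
    return rotr32(x, 7) ^ rotr32(x, 18) ^ (x >> 3)

print(hex(small_sigma0(1)))
```

A benchmark built on such hashes would therefore exercise integer ALU throughput and say nothing about floating-point performance.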

How to find the time complexity of the algebra operation in algebraixlib

How can I calculate the time complexity, using mathematics or Big O notation, of the algebra operations used in the algebra of data?
I will use a book example to explain my question. Consider the following example given in the book.
In the above example I would like to calculate the time complexity of the transpose and compose operations.
If it is possible, I would also like to find out the time complexity of other data algebra operations.
Please let me know if you need more explanation.
@wesholler I edited my question to better understand your explanation. Following is a real-life example; suppose we want to calculate the time complexity of the operations used below.
Suppose I have algebra of data operations as follows
Could you describe how we would calculate the time complexity in the above example, preferably in Big O?
Thanks
This answer has three parts:
General Time Complexity Analysis
Generally, the time complexity/BigO can be determined by considering the origin of an operation - that is, what operations were extended from more primitive algebras to derive this one?
The following rules describe the upper-bound on the time complexity for both unary and binary operations that are extended into their power set algebras.
Unary extension can be thought of similarly to a map operation and so has linear time complexity. Binary extension evaluates the cross product of the operation's arguments and so has a worst-case time complexity similar to O(n^2). However it is important to consider that the real upper bound is a product of the cardinality of both arguments; this comes up in practice often when the right-hand argument to a composition or superstriction operation is a singleton.
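The two extension rules can be sketched in plain Python. This is not algebraixlib code: couplets are modelled as (left, right) tuples, the composition convention is chosen only for illustration, and the helper names are hypothetical. Each helper also returns how many times the primitive operation was called.

```python
def extend_unary(op, xs):
    """Extend op to a set: one call per element, so len(xs) calls."""
    out, calls = set(), 0
    for x in xs:
        out.add(op(x))
        calls += 1
    return out, calls

def extend_binary(op, xs, ys):
    """Extend op to two sets: one call per pair, so len(xs) * len(ys) calls.
    Pairs for which op is undefined (returns None) are dropped."""
    out, calls = set(), 0
    for x in xs:
        for y in ys:
            result = op(x, y)
            calls += 1
            if result is not None:
                out.add(result)
    return out, calls

def transpose(c):
    return (c[1], c[0])

def compose(c1, c2):
    """(a, b) composed with (b, c) gives (a, c); undefined otherwise."""
    return (c1[0], c2[1]) if c1[1] == c2[0] else None

rel1 = {(1, 2), (2, 3), (3, 4)}
rel2 = {(2, 20)}                            # singleton right-hand argument

print(extend_unary(transpose, rel1))        # 3 calls
print(extend_binary(compose, rel1, rel1))   # 3 * 3 = 9 calls
print(extend_binary(compose, rel1, rel2))   # 3 * 1 = 3 calls
```

The last call shows the singleton case: the cost is the product of the two cardinalities, which here collapses to linear in the left-hand argument.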
Time Complexity for algebraixlib Implementations
We can take a look at a few examples of how extension affects the time complexity while at the same time analyzing the complexity of the implementations in algebraixlib (the last part talks about other implementations.)
Because it is a reference implementation for data algebra, algebraixlib implements extended operations very literally. For that reason, Big Theta is used below, because the formulas represent both the lower and upper bounds of the time complexity.
Here is the unary operation transpose being extended from couplets to relations and then to clans.
Likewise, here is the binary operation compose being extended from couplets to relations and then to clans.
It is clear that the complexity of both of the clan operations is influenced by both the number of relation elements as well as the number of couplets in those relations.
Time Complexity for Other Implementations
It is important to note that the above section describes the time complexity that is specific to the algorithms implemented in algebraixlib.
One could imagine implementing e.g. clans.cross_union with a method similar to sort-merge-join or hash-join. In this case, the upper bound would remain the same, but the lower bound (and expected) time complexity would be reduced by one or more degrees.
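As a sketch of what such an alternative might look like, here is relation composition written both ways in plain Python. This is an illustration on (left, right) tuples under my own composition convention, not clans.cross_union and not algebraixlib code; both function names are hypothetical.

```python
from collections import defaultdict

def compose_nested(rel1, rel2):
    """Literal extension: test every pair, len(rel1) * len(rel2) comparisons."""
    return {(a, c) for (a, b) in rel1 for (b2, c) in rel2 if b == b2}

def compose_hash(rel1, rel2):
    """Hash-join style: index rel2 by its left component, then probe once per
    couplet of rel1. Work is about len(rel1) + len(rel2) plus the output size."""
    index = defaultdict(list)
    for b, c in rel2:
        index[b].append(c)
    return {(a, c) for (a, b) in rel1 for c in index.get(b, ())}

rel1 = {(i, i + 1) for i in range(50)}
rel2 = {(i, 10 * i) for i in range(0, 50, 2)}
print(compose_nested(rel1, rel2) == compose_hash(rel1, rel2))
```

If every couplet in rel1 matches every couplet in rel2, the output itself has len(rel1) * len(rel2) elements, which is why the upper bound does not improve.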

Matlab `corr` gives different results on the same dataset. Is floating-point calculation deterministic?

I am using Matlab's corr function to calculate the correlation of a dataset. While the results agree within double-precision accuracy (<10^-14), they are not exactly the same even on the same computer for different runs.
Is floating-point calculation deterministic? Where is the source of the randomness?
Yes and no.
Floating point arithmetic, as in a sequence of operations +, *, etc. is deterministic. However in this case, linear algebra libraries (BLAS, LAPACK, etc) are most likely being used, which may not be: for example, matrix multiplication is typically not performed as a "triple loop" as some references would have you believe, but instead matrices are split up into blocks that are optimised for maximum performance based on things like cache size. Therefore, you will get different sequences of operations, with different intermediate rounding, which will give slightly different results. Typically, however, the variation in these results is smaller than the total rounding error you are incurring.
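The effect of grouping is easy to reproduce in miniature. The following Python sketch adds the same three doubles in two different orders; it says nothing about what Matlab or any particular BLAS actually does internally.

```python
a, b, c = 0.1, 0.2, 0.3

left_to_right = (a + b) + c   # one grouping of the additions
regrouped = a + (b + c)       # same numbers, different intermediate rounding

print(left_to_right == regrouped)
print(abs(left_to_right - regrouped))
```

Each run of either expression is perfectly repeatable; the results differ only because the sequence of operations differs, which is what a blocked or multi-threaded library can change from run to run.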
I have to admit, I am a little bit surprised that you get different results on the same computer, but it is difficult to know why without knowing what the library is doing (IIRC, Matlab uses the Intel BLAS libraries, so you could look at their documentation).

How can MLE Likelihood evaluations be so different if I break up a log likelihood into its sum?

This is something I noticed in Matlab when trying to do MLE. My first estimator used the log likelihood of a pdf and broke the product up as a sum. For example, the log of a Weibull pdf (f(x) = b a x^(a-1) exp(-b x^a)) broken up is:
log_likelihood=log(b)+log(a)+(a-1)log(x)-bx^a
Evaluating this is wildly different to this:
log_likelihood=log(bax^(a-1)exp(-bx^a))
What is the computer doing differently in the two stages? The first one gives a much larger number (by a couple orders of magnitude).
Depending on the numbers you use, this could be a numerical issue: If you combine very large numbers with very small numbers, you can get inaccuracies due to limitations in number precision.
One possibility is that you lose some accuracy in the second case since you are operating at different scales.
I work on a scientific software project implementing maximum likelihood of phylogenetic trees, and consistently run into issues regarding numerical precision. Often the discrepancy is ...
1. between competing applications with the same values in the model,
2. when calculating the MLE scores by hand,
3. in the order of the operations in the computation.
It really all comes down to number three, even in your case. Multiplication of very small and very large numbers can cause weird results when their exponents are scaled during computation. There is a lot about this in the (in)famous "What Every Computer Scientist Should Know About Floating-Point Arithmetic". But what I've mentioned is the short of it, if that's all you are interested in.
Overall, the issues you are seeing are strictly numerical issues in the representation of floating point / double precision numbers and operations when computing the function. I'm not too familiar with MATLAB, but it may have an arbitrary-precision type that would give you better results.
Aside from that, keep the expressions symbolic as long as possible, and if you have any intuition about the variables' sizes (as in a is always very large compared to x), then make sure you are choosing the placement of parentheses wisely.
The first equation should be better since it is dealing with adding logs, and should be much more stable than the second, although x^a makes me a bit wary as it would dominate the equation, but it would in practice anyway.
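Here is one way the two forms can diverge, sketched in Python rather than MATLAB. The parameter values are made up, not taken from the question; they are chosen so that exp(-b x^a) underflows to zero in double precision.

```python
import math

def loglik_sum(x, a, b):
    """log f(x) expanded into a sum of logs."""
    return math.log(b) + math.log(a) + (a - 1) * math.log(x) - b * x**a

def loglik_direct(x, a, b):
    """Evaluate the density first, then take its log.
    The density can underflow to 0.0, in which case the log is -infinity."""
    density = b * a * x**(a - 1) * math.exp(-b * x**a)
    return math.log(density) if density > 0.0 else float("-inf")

x, a, b = 10.0, 3.0, 1.0      # illustrative values only

print(loglik_sum(x, a, b))     # finite
print(loglik_direct(x, a, b))  # the density underflowed before the log was taken
```

For moderate values the two functions agree to within rounding error; they only part ways once an intermediate factor leaves the representable range.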