Binary Arithmetic Coding and Arithmetic Coding Difference - encoding

What is the difference between binary arithmetic coding and arithmetic coding?
I don't understand these two notions. I think these algorithms are the same. Can you give details and examples for these notions and the difference between them?

You should link the source or sources of what you're trying to compare, so we can see what exactly they're talking about.
However, I can guess that binary arithmetic coding specifically refers to encoding a 0 or 1 at each step, i.e. an alphabet of two symbols. You can also have q-ary arithmetic coding, where you split the current probability interval into q sub-intervals at each step instead of two.
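As an illustration, here is a minimal sketch (with made-up probabilities and message bits) of the interval-narrowing step a binary arithmetic coder performs once per input bit; a q-ary coder would split the interval into q parts at the same point:
p    = 0.7;                  % assumed probability of a 0 (hypothetical value)
bits = [0 1 0 0 1];          % message to encode (hypothetical)
lo = 0; hi = 1;              % current code interval [lo, hi)
for b = bits
    mid = lo + p*(hi - lo);  % split the interval in proportion to p
    if b == 0
        hi = mid;            % a 0 keeps the lower sub-interval
    else
        lo = mid;            % a 1 keeps the upper sub-interval
    end
end
codeword = (lo + hi)/2;      % any number in [lo, hi) identifies the message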

Related

How to resolve Underflow (/Overflow) issues in Fixed-Point tool of MATLAB Simulink?

I am working on a Simulink model that involves floating-point data types, so I am using the Fixed-Point Tool available in Simulink to convert my floating-point system to a fixed-point one. I am following the tutorial available here to achieve the conversion.
Link to the tutorial on converting the floating-point system to the fixed point
In the data type proposing step, I got underflow values for some of the variables. My question is how to bring those underflowing values into range as well. Or can I ignore them and proceed with the further steps? In general, how should this type of underflow/overflow issue be tackled?
Using fixed-point arithmetic can be faster and use less resources than floating-point arithmetic, but a significant disadvantage is that underflow and overflow are not handled gracefully. If you try to detect and recover from these conditions you will lose much of the advantage provided by fixed-point.
In practice, you should select a fixed-point format for your variables that provides enough bits for the integer part (the bits to the left of the radix point) so that overflow cannot occur. This requires careful analysis of your algorithms and the potential ranges of all variables. Your format should also provide enough fraction bits (to the right of the radix point) so that underflows do not cause significant problems with your algorithm.
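As a rough sketch of that trade-off (this assumes the Fixed-Point Designer toolbox and uses a hypothetical value), a signed 16-bit format with 8 fraction bits leaves 7 integer bits, so it covers |x| < 128 with a resolution of 2^-8:
x = 123.456;                 % hypothetical value known to stay below 2^7
a = fi(x, 1, 16, 8);         % signed, 16-bit word length, 8 fraction bits
% 7 integer bits cover |x| < 128, so this value cannot overflow;
% the resolution is 2^-8, so the quantization error is at most 2^-9.
quant_error = double(a) - x;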

Does unary minus just change sign?

Consider for example the following double-precision numbers:
x = 1232.2454545e-89;
y = -1232.2454545e-89;
Can I be sure that y is always exactly equal to -x (or Matlab's uminus(x))? Or should I expect small numerical differences of the order of eps, as often happens with numerical computations? Try for example sqrt(3)^2-3: the result is not exactly zero. Can that happen with unary minus as well? Is it lossy like square root is?
Another way to put the question would be: is a negative numerical literal always equal to negating its positive counterpart?
My question refers to Matlab, but probably has more to do with the IEEE 754 standard than with Matlab specifically.
I have done some tests in Matlab with a few randomly selected numbers. I have found that, in those cases:
y and -x do indeed turn out to be exactly equal.
typecast(x, 'uint8') and typecast(-x, 'uint8') differ only in the sign bit, as defined by the IEEE 754 double-precision format.
This suggests that the answer may be affirmative. If applying unary minus only changes the sign bit, and not the significand, no precision is lost.
But of course I have only tested a few cases. I'd like to be sure this happens in all cases.
This question is computer architecture dependent. However, the sign of floating point numbers on modern architectures (including x64 and ARM cores) is represented by a single sign bit, and they have instructions to flip this bit (e.g. FCHS). That being the case, we can draw two conclusions:
A change of sign can be achieved (and indeed is by modern compilers and architectures) by a single bit flip/instruction. This means that the process is completely invertible, and there is no loss of numerical accuracy.
It would make no sense for MATLAB to do anything other than the fastest, most accurate thing, which is just to flip that bit.
That said, the only way to be sure would be to inspect the assembly code for uminus in your MATLAB installation. I don't know how to do this.
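If you want more than a few random spot checks, one quick (though still not exhaustive) sanity test is to compare the raw bit patterns directly; for a double, only the most significant bit (the sign bit) should differ:
x  = 1232.2454545e-89;
bx = typecast(x,  'uint64');          % raw bits of x
bn = typecast(-x, 'uint64');          % raw bits of -x
signbit = bitshift(uint64(1), 63);    % IEEE 754 double sign bit
only_sign_differs = (bitxor(bx, bn) == signbit);   % expected: true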

Matlab `corr` gives different results on the same dataset. Is floating-point calculation deterministic?

I am using Matlab's corr function to calculate the correlation of a dataset. While the results agree to within double-precision accuracy (differences < 10^-14), they are not exactly the same on the same computer for different runs.
Is floating-point calculation deterministic? Where is the source of the randomness?
Yes and no.
Floating-point arithmetic, as in a sequence of operations +, *, etc., is deterministic. However, in this case linear algebra libraries (BLAS, LAPACK, etc.) are most likely being used, and these may not be: for example, matrix multiplication is typically not performed as a "triple loop" as some references would have you believe; instead, matrices are split up into blocks that are optimised for maximum performance based on things like cache size. Moreover, these libraries are usually multithreaded, and the order in which partial results from different threads are combined can vary from run to run. You therefore get different sequences of operations, with different intermediate rounding, which give slightly different results. Typically, however, the variation in these results is smaller than the total rounding error you are incurring.
I have to admit, I am a little bit surprised that you get different results on the same computer, but it is difficult to know why without knowing what the library is doing (IIRC, Matlab uses the Intel BLAS libraries, so you could look at their documentation).
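The underlying effect is simply that floating-point addition is not associative, which you can see without any BLAS involvement; this sketch (with hypothetical data) sums the same numbers in two different orders:
rng(0);                       % hypothetical seed, only for reproducibility
v  = randn(1, 1e6);
s1 = sum(v);                  % one accumulation order
s2 = sum(v(end:-1:1));        % the same data, summed in reverse
difference = s1 - s2;         % typically a tiny but nonzero difference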

How can MLE Likelihood evaluations be so different if I break up a log likelihood into its sum?

This is something I noticed in Matlab when trying to do MLE. My first estimator used the log likelihood of a pdf and broke the product up into a sum. For example, the log of a Weibull pdf, f(x) = b*a*x^(a-1)*exp(-b*x^a), broken up is:
log_likelihood = log(b) + log(a) + (a-1)*log(x) - b*x^a
Evaluating this gives a wildly different result from evaluating this:
log_likelihood = log(b*a*x^(a-1)*exp(-b*x^a))
What is the computer doing differently in the two cases? The first one gives a much larger number (by a couple of orders of magnitude).
Depending on the numbers you use, this could be a numerical issue: If you combine very large numbers with very small numbers, you can get inaccuracies due to limitations in number precision.
One possibility is that you lose some accuracy in the second case since you are operating at different scales.
I work on a scientific software project implementing maximum likelihood of phylogenetic trees, and consistently run into issues regarding numerical precision. Often the discrepancy is:
1. between competing applications with the same values in the model,
2. when calculating the MLE scores by hand,
3. in the order of the operations in the computation.
It really all comes down to number three, even in your case. Multiplication of small and very large numbers can cause weird results when their exponents are scaled during computation. There is a lot about this in the (in)famous "What Every Computer Scientist Should Know About Floating-Point Arithmetic", but what I've mentioned is the short of it if that's all you are interested in.
Overall, the issue you are seeing is strictly a numerical issue in the representation of floating-point / double-precision numbers and the operations used when computing the function. I'm not too familiar with MATLAB, but it may have an arbitrary-precision type that would give you better results.
Aside from that, keep things symbolic as long as possible, and if you have any intuition about the variables' sizes (as in, a is always very large compared to x), then make sure you are choosing the order of parentheses wisely.
The first equation should be better since it deals with adding logs, and should be much more stable than the second, although the x^a term makes me a bit wary, since it will dominate the equation; but it would do so in practice anyway.
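As a concrete illustration of how the two forms can diverge (using made-up parameter values, not the asker's data): when b*x^a is large, exp(-b*x^a) underflows to 0, so the combined form returns -Inf while the sum-of-logs form stays finite:
a = 3; b = 2; x = 50;                                  % hypothetical values
ll_combined = log(b*a*x^(a-1)*exp(-b*x^a));            % exp underflows to 0, so this is -Inf
ll_split    = log(b) + log(a) + (a-1)*log(x) - b*x^a;  % finite, roughly -2.5e5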

Fixed point in Matlab

Can someone please explain this?
As I understand it, it provides less precision. Is it a speed-up that one wishes to get by using it? When is it good to use? Should I use it in Matlab Coder?
Not all the computers in the world use floating-point arithmetic. In particular, many devices which have a connection to the world (such as sensors and the computers which process their data) use fixed-point representations of numbers. Some researchers into algorithms and similar matters also want to use fixed-point numbers. Matlab's fixed-point toolbox allows its users to do fixed-point arithmetic on their PCs, and to write code targeted at execution on devices which implement it.
It's not (necessarily) true that Matlab's fixed-point arithmetic provides less precision; it can be used to provide more precision than IEEE floating-point types.
Is it a speed-up? That's beside the point (read on).
When is it good to use? When you need to use fixed-point arithmetic. I'm not sure anyone would recommend it as a general-purpose replacement for floating-point arithmetic.
Should you use it? Your question suggests that the answer is almost certainly 'No'; if you ought to be using fixed-point arithmetic, you would already know it.
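To make the precision point concrete (a sketch assuming the Fixed-Point Designer toolbox, with pi as an arbitrary example value): for the same 32-bit word length, a fixed-point format with many fraction bits resolves finer steps near that value than IEEE single precision does:
p_fixed  = fi(pi, 1, 32, 28);     % signed, 32-bit word, 28 fraction bits
step_fix = eps(p_fixed);          % quantization step: 2^-28, about 3.7e-9
step_sgl = eps(single(pi));       % single-precision spacing near pi, about 2.4e-7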