MAAB guideline na_0002: Appropriate implementation of fundamental logical and numerical operations - MATLAB

The MAAB guideline [na_0002: Appropriate implementation of fundamental logical and numerical operations][1] indicates that the logical data type shouldn't be used in numerical operations. Among the rationales listed is code generation, but I can't understand how using the logical data type in numerical operations could negatively affect code generation, since a cast is applied to the logical data type.
Any help please.
Regards

You shouldn't add, subtract, multiply or divide numbers by logicals (booleans). It just doesn't make sense.
For example, should the result of a math operation involving a logical operand be numerical or logical? It's ambiguous.
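As a small MATLAB illustration of the implicit promotion the guideline is warning about (the variable names here are just placeholders):
flag = true;                 % logical
x = 3.5;
y1 = x + flag;               % works, but relies on an implicit logical-to-double cast
y2 = x + double(flag);       % explicit cast, which is what the guideline asks for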

Related

Why "time==0.5" isn't a discrete expression in Modelica language?

I build a simple model to understand the concept of "Discrete expressions", here is the code:
model Trywhen
  parameter Real B[:] = {1.0, 2.0, 3.0};
algorithm
  when time == 0.5 then
    Modelica.Utilities.Streams.print("message");
  end when;
  annotation (uses(Modelica(version="3.2.3")));
end Trywhen;
But when checking the model, I got an error showing that "time==0.5" isn't a discrete expression.
If I change time==0.5 to time>=0.5, the model would pass the check.
And if I change the when-clause to an if-clause, the model works fine, but with a warning showing that "Variables of type Real cannot be compared for equality."
My questions are:
Why is time==0.5 NOT a discrete expression?
Why can't variables of type Real be compared for equality? Comparing two variables of type Real seems like a common operation.
The first question is not important, since time==0.5 is not allowed.
The second question is the important one:
Comparing reals for equality is common in other languages, and also a common source of errors - unless special care is taken.
Merely using the processor's floating point compare is a really bad idea on some processors (like Intel) that mix 80-bit and 64-bit floating point numbers (or it comes with a performance penalty), and in other cases it may not work as intended. In this case 0.5 can be represented exactly as a floating point number, but 0.1 and 0.2 cannot.
Often abs(x-y)<eps is a good alternative, but it depends on the intended use and the eps depends on additional factors; not only machine precision but also which algorithm is used to compute x and y and its error propagation.
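A tiny illustration of the problem, in MATLAB/Octave syntax (the same behaviour occurs in any IEEE 754 double arithmetic):
a = 0.1 + 0.2;               % 0.1 and 0.2 have no exact binary representation
b = 0.3;
a == b                       % false: exact equality fails
abs(a - b) < 1e-12           % true: tolerance-based comparison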
In Modelica the problems are worse than in many other languages, since tools are allowed to optimize expressions a lot more (including symbolic manipulations) - which makes it even harder to figure out a good value for eps.
All those problems mean that it was decided to not allow comparison for equality - and require something more appropriate.
In particular if you know that you will only approach equality from one direction you can avoid many of the problems. In this case time is increasing, so if it has been >0.5 at an event it will not be <=0.5 at a later event, and when will only trigger the first time the expression becomes true.
Therefore when time>=0.5 will only trigger once, and will trigger about when time==0.5, so it is a good alternative. However, there might be some numerical inaccuracies and thus it might trigger at 0.500000000000001.

How do you choose an optimal PlainModulus in SEAL?

I am currently learning how to use SEAL. Among the parameters for the BFV scheme there is a helper function for choosing the PolyModulus and CoeffModulus, but no such helper is provided for choosing the PlainModulus, beyond the advice that it should be either a prime or a power of 2. Is there any way to know which value is optimal?
In the given example the PlainModulus was set to parms.PlainModulus = new SmallModulus(256); Is there any special reason for choosing the value 256?
In BFV, the plain_modulus basically determines the size of your data type, just like in normal programming when you use 32-bit or 64-bit integers. When using BatchEncoder the data type applies to each slot in the plaintext vectors.
How you choose plain_modulus matters a lot: the noise budget consumption in multiplications is proportional to log(plain_modulus), so there are good reasons to keep it as small as possible. On the other hand, you'll need to ensure that you don't get into overflow situations during your computations, where your encrypted numbers exceed plain_modulus, unless you specifically only care about correctness of the results modulo plain_modulus.
In almost all real use-cases of BFV you should want to use BatchEncoder to not waste plaintext/ciphertext polynomial space, and this requires plain_modulus to be a prime. Therefore, you'll probably want it to be a prime, except in some toy examples.
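If you do end up using BatchEncoder, the extra constraint (beyond primality) is that plain_modulus must be congruent to 1 modulo 2*poly_modulus_degree. Below is only a rough sketch of searching for such a prime, written in MATLAB purely for illustration, with N and the bit size picked arbitrarily; as far as I know, recent SEAL releases also ship a PlainModulus.Batching helper that performs essentially this search for you.
N = 4096;                          % assumed poly_modulus_degree
bits = 20;                         % assumed target size of plain_modulus in bits
t = 2^bits + 1;
t = t - mod(t - 1, 2*N);           % align candidate so that t == 1 (mod 2*N)
while ~isprime(t)
    t = t + 2*N;                   % step through candidates of that form
end
fprintf('candidate plain_modulus: %d\n', t);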

Change default numeric type to float in matlab

MATLAB by default uses double as the numeric type. I am training a GMM and running out of memory, so I want to change the default numeric type to float (single), which takes half the memory of double. Is it possible?
I know that single(A) converts a double precision element A to single precision but we need to allocate double precision storage for A first which runs out of memory. Also, I cannot use single() around all my matrix allocation as various functions in many toolboxes are called which I cannot change manually.
So is there a way that calling zeros(n) will allocate a matrix of floats by default instead of double?
No, there is currently no way to change the default numeric type to float / single. See these informative posts on MathWorks forums:
http://www.mathworks.com/matlabcentral/answers/8727-single-precision-by-default-lots-of-auxiliary-variables-to-cast
http://www.mathworks.com/matlabcentral/answers/9591-is-there-a-way-to-change-matlab-defaults-so-that-all-workspace-floating-point-values-to-be-stored-i
Also, quoting John D'Errico on the first link I referenced - a formidable and legendary MATLAB expert:
This is not possible in MATLAB. Anyway, it is rarely a good idea to work in single. It is actually slower in many cases anyway. The memory saved is hardly worth it compared to the risk of the loss in precision. If you absolutely must, use single on only the largest arrays.
As such, you should probably consider reformulating your algorithm if you are using so much memory. If you are solving linear systems that are quite large and there are many zero coefficients, consider using sparse to reduce your memory requirements.
Besides which, doing this would be dangerous because there may be functions in other toolboxes that rely on the fact that double type allocation of matrices is assumed and spontaneously changing these to single may have unintended consequences.
As @rayryeng said, there's no way in MATLAB to "change the default numeric type" to single. I'm not even entirely sure what that would mean.
However, you asked a specific question as well:
So is there a way that calling zeros(n) will allocate a matrix of floats by default instead of double?
Yes - you can use zeros(n, 'single'). That will give you an array of zeros of type single. zeros(n) is just a shorthand for zeros(n, 'double'), and you can ask for any other numeric type you want as well, such as uint8 or int64. The other array creation functions such as ones, rand, randn, NaN, inf, and eye support similar syntaxes.
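For example (the array size is arbitrary; the byte counts follow from single using 4 bytes per element versus 8 for double):
a = zeros(1000, 'double');
b = zeros(1000, 'single');
whos a b                     % a: 8,000,000 bytes, b: 4,000,000 bytes
class(b)                     % 'single'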
Note that operations carried out on arrays of single type may not always return outputs of type single (so you may need to subsequently cast them to single), and they may use intermediate arrays that are not of type single (so you may not always get all the memory advantages you might hope for). For example, many functions in Image Processing Toolbox will accept inputs of type single, but will then internally convert to double in order to carry out the operations. The functions from Statistics Toolbox to fit GM models do appear to accept inputs of type single, but I don't know what they do internally.

Efficient Function to Map (or Hash) Integers and Integer Ranges into Index

We are looking for the computationally simplest function that will enable an indexed look-up of a function to be determined by a high frequency input stream of widely distributed integers and ranges of integers.
It is OK if the hash/map function selection itself varies based on the specific integer and range requirements, and the performance associated with the part of the code that selects this algorithm is not critical. The number of integers/ranges of interest in most cases will be small (zero to a few thousand). The performance critical portion is in processing the incoming stream and selecting the appropriate function.
As a simple example, please consider the following pseudo-code:
switch (highFrequencyIntegerStream)
    case(2)      : func1();
    case(3)      : func2();
    case(8)      : func3();
    case(33-122) : func4();
    ...
    case(10,000) : func40();
In a typical example, there would be only a few thousand of the "cases" shown above, which could include a full range of 32-bit integer values and ranges. (In the pseudo code above 33-122 represents all integers from 33 to 122.) There will be a large number of objects containing these "switch statements."
(Note that the actual implementation will not include switch statements. It will instead be a jump table (which is an array of function pointers) or maybe a combination of the Command and Observer patterns, etc. The implementation details are tangential to the request, but provided to help with visualization.)
Many of the objects will contain "switch statements" with only a few entries. The values of interest are subject to real time change, but performance associated with managing these changes is not critical. Hash/map algorithms can be re-generated slowly with each update based on the specific integers and ranges of interest (for a given object at a given time).
We have searched around the internet, looking at Bloom filters, various hash functions listed on Wikipedia's "hash function" page and elsewhere, quite a few Stack Overflow questions, abstract algebra (mostly Galois theory which is attractive for its computationally simple operands), various ciphers, etc., but have not found a solution that appears to be targeted to this problem. (We could not even find a hash or map function that considered these types of ranges as inputs, much less a highly efficient one. Perhaps we are not looking in the right places or using the correct vernacular.)
The current plan is to create a custom algorithm that preprocesses the list of interesting integers and ranges (for a given object at a given time), looking for shifts and masks that can be applied to the input stream to help delineate the ranges. Note that most of the incoming integers will be uninteresting, and it is of critical importance to make a very quick decision for as large a percentage of that portion of the stream as possible (which is why Bloom filters looked interesting at first, before we started thinking that their implementation required more computational complexity than other solutions).
Because the first decision is so important, we are also considering having multiple tables, the first of which would be inverse masks (masks to select uninteresting numbers) for the easy to find large ranges of data not included in a given "switch statement", to be followed by subsequent tables that would expand the smaller ranges. We are thinking this will, for most cases of input streams, yield something quite a bit faster than a binary search on the bounds of the ranges.
Note that the input stream can be considered to be randomly distributed.
There is a pretty extensive theory of minimal perfect hash functions that I think will meet your requirement. The idea of a minimal perfect hash is that a set of distinct inputs is mapped to a dense set of integers in 1-1 fashion. In your case a set of N 32-bit integers and ranges would each be mapped to a unique integer in a range of size a small multiple of N. GNU has a perfect hash function generator called gperf that is meant for strings but might possibly work on your data. I'd definitely give it a try. Just add a length byte so that integers are 5-byte strings and ranges are 9 bytes. There are some formal references on the Wikipedia page. A literature search in ACM and IEEE literature will certainly turn up more.
I just ran across this library I had not seen before.
Addition
I see now that you are trying to map all integers in the ranges to the same function value. As I said in the comment, this is not very compatible with hashing because hash functions deliberately try to "erase" the magnitude information in a bit's position so that values with similar magnitude are unlikely to map to the same hash value.
Consequently, I think that you will not do better than an optimal binary search tree, or equivalently a code generator that produces an optimal "tree" of "if else" statements.
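To make the "search over range bounds" idea concrete, here is a small sketch using the cases from your pseudo-code. It is written in MATLAB purely for illustration, and func1..func40 are hypothetical stand-ins; in C or C++ the same structure would be a sorted array of lower bounds plus a parallel array of function pointers, searched with a binary search.
func1  = @() disp('func1');    % hypothetical handlers standing in for the real ones
func2  = @() disp('func2');
func3  = @() disp('func3');
func4  = @() disp('func4');
func40 = @() disp('func40');
edges    = [2 3 4 8 9 33 123 10000 10001];                   % sorted bucket boundaries
handlers = {func1, func2, [], func3, [], func4, [], func40}; % [] marks uninteresting buckets
x = 57;                        % incoming integer from the stream
k = discretize(x, edges);      % binary-search style bucket lookup (NaN if out of range)
if ~isnan(k) && ~isempty(handlers{k})
    handlers{k}();             % dispatches func4 for x in 33..122
end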
If we wanted to construct a function of the type you are asking for, we could try using real numbers where individual domain values map to consecutive integers in the co-domain and ranges map to unit intervals in the co-domain. So a simple floor operation will give you the jump table indices you're looking for.
In the example you provided you'd have the following mapping:
2 -> 0.0
3 -> 1.0
8 -> 2.0
33 -> 3.0
122 -> 3.99999
...
10000 -> 42.0 (for example)
The trick is to find a monotonically increasing polynomial that interpolates these points. This is certainly possible, but with thousands of points I'm certain you'd end up with something much slower to evaluate than the optimal search would be.
Perhaps our thoughts on hashing integers can help a little bit. You will also find there a hashing library (hashlib.zip) based on Bob Jenkins' work which deals with integer numbers in a smart way.
I would propose to deal with larger ranges after the single cases have been rejected by the hashing mechanism.

Using filter function to generate missing data

I have vectors of data that I feed through the filter() function -- said filter was constructed to emit a reasonable approximation of the original signal that is then used to identify "bad" elements in the original data (said elements are typically caused by infrequent short-duration sensor malfunctions and are quite distinct from good data). After identifying these bad elements, I want to go back and replace them with something reasonable.
One approach would be to replace the bad values with the filtered output; however, the output was generated with the bad values, so it has some amount of undesired distortion.
Ideally, I'd like a way to tell filter() to assume that the bad element[s] are missing and that it should instead generate a reasonable interpolation of the missing value[s] (e.g., based on the surrounding values and the properties of the filter) for use when constructing the output.
I've been told that certain toolboxes allow insertion of special values (e.g., NaN) to indicate missing (but assumed to be well-behaved) data.
I looked at the source code for Octave's filter() and nothing obvious leapt out at me -- is there a special value (or other mechanism) to tell filter() to assume that well-behaved data is missing (and should be inserted as needed)?
Inserting NaN won't work for this. The filter function is pretty simple -- it just implements an IIR filter.
If your signal is smooth and slowly-changing, you might get away with simply using interp1 to interpolate new values for the bad stretches based on the good data on either side.
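A rough sketch of that interp1 approach (the signal and the bad-sample mask below are made-up stand-ins; in practice the mask would come from your existing detection step, and b, a would be your filter coefficients):
x   = cumsum(randn(1, 200));              % stand-in for the real sensor signal
bad = false(size(x));
bad([50:55, 120]) = true;                 % assumed output of the bad-sample detector
t   = 1:numel(x);
xfix = x;
xfix(bad) = interp1(t(~bad), x(~bad), t(bad), 'pchip');  % fill from surrounding good samples
% y = filter(b, a, xfix);                 % then filter the repaired signal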
If your signal has more complicated spectral content, I think "Wiener interpolation" is the phrase to google for. For extrapolation you can use linear predictive coding.