Is float16 supported in MATLAB?

Does MATLAB support float16 operations? If so, how do I convert a double matrix to float16? I am doing arithmetic on a large matrix for which a 16-bit floating-point representation is sufficient, and storing it as double takes four times as much memory.

Is your matrix full? If not, try sparse -- it saves a lot of memory when there are lots of zero-valued elements.
AFAIK, float16 is not supported. The smallest floating-point type you can use is single, which is a 32-bit type:
A = single( rand(50) );
You could multiply by a constant and cast to int16, but you'd lose precision.
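To make the scale-and-cast idea concrete, here is a minimal sketch (written in Swift to match the examples further down this page; the value and scale factor are made up for illustration):
let x = 0.73519842                // original double-precision value
let scale = Double(Int16.max)     // 32767
let q = Int16(x * scale)          // quantized: 24090, fits in 16 bits
let back = Double(q) / scale      // ≈ 0.73519, low-order digits lost
print(q, back)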

The numeric classes that MATLAB supports out of the box are the following:
int8
int16
int32
int64
uint8
uint16
uint32
uint64
single (32-bit float)
double (64-bit float)
plus the complex data type. So no 16-bit floats, unfortunately.
On the MathWorks File Exchange, there seems to be a half-precision float library. It requires MEX, however.

This might be an old question, but I found it while searching for a similar problem (half precision in MATLAB).
Things seem to have changed over time:
https://www.mathworks.com/help/fixedpoint/ref/half.html
Half precision now seems to be supported natively by MATLAB, via the half type in Fixed-Point Designer.

Related

What are scenarios where you should use Float in Swift?

I'm learning about the difference between Floats and Doubles in Swift, and I can't think of any reasons to use Float. I know there are some, and that I am just not experienced enough to understand them.
So my question is: why would you use Float in Swift?
why would you use float in Swift
Left to your own devices, you likely never would. But there are situations where you have to. For example, the value of a UISlider is a Float. So when you retrieve that number, you are working with a Float. It’s not up to you.
And so with all the other numerical types. Swift includes a numerical type corresponding to every numerical type that you might possibly encounter as you interface with Cocoa and the outside world.
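For example (a minimal sketch; the slider is created in isolation here just to show the types involved):
import UIKit

let slider = UISlider()
slider.value = 0.5                  // UISlider.value is declared as Float
let fraction: Float = slider.value  // you get a Float whether you want one or not
let percent = Double(fraction) * 100  // explicit conversion needed to mix with Double math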
Float is a typealias for Float32. Float32 and Float16 are incredibly useful for GPU programming with Metal. They both will feel as archaic someday on the GPU as they do on the CPU, but that day is years off.
https://developer.apple.com/metal/
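For instance, on platforms where the standard library exposes Float16 natively (Swift 5.3+ on arm64 -- an assumption about your target), you can hold half-precision data on the CPU side before handing it to Metal:
let weights: [Float16] = [0.25, 0.5, 0.75]   // half the memory of [Float]
let widened = weights.map { Float($0) }      // widen to Float32 for CPU-side math
print(widened)                               // [0.25, 0.5, 0.75]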
Double
Represents a 64-bit floating-point number.
Has a precision of at least 15 decimal digits.
Float
Represents a 32-bit floating-point number.
Has a precision of as little as 6 decimal digits.
The appropriate floating-point type to use depends on the nature and range of values you need to work with in your code. In situations where either type would be appropriate, Double is preferred.
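A quick way to see the difference for yourself:
let d: Double = 1.0 / 3.0
let f: Float = 1.0 / 3.0
print(d)   // 0.3333333333333333  (roughly 16 significant digits)
print(f)   // 0.33333334          (roughly 7 significant digits)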

Does converting UInt8(or similar types) to Int counter the purpose of UInt8?

I'm storing many of the integers in my program as UInt8, which has a 0-255 range of values. Later on I will be summing many of them, and the result will need to be stored in an Int. Doesn't the conversion from UInt8 to Int that I have to do before adding the values defeat the purpose of using a smaller datatype to begin with? I feel it would be faster to just use Int and accept the larger memory footprint; why go for UInt8 when I then face many conversions that cost speed and memory? Is there something I'm missing, or should smaller datatypes really only be used with other small datatypes?
You are talking about a few bytes per variable when storing as UInt8 instead of Int. These small data types were conceived very early in the history of computing, when memory was measured in the low KBs. Even the Apple Watch has 512MB.
Here's what Apple says in the Swift Book:
Unless you need to work with a specific size of integer, always use Int for integer values in your code. This aids code consistency and interoperability. Even on 32-bit platforms, Int can store any value between -2,147,483,648 and 2,147,483,647, and is large enough for many integer ranges.
I use UInt8, UInt16 and UInt32 mainly in code that deals with C. And yes, converting back and forth is a pain in the neck.
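As a sketch of the widening the question describes (the array is just example data):
let bytes: [UInt8] = [250, 250, 250]
// Summing UInt8s directly would trap once the total passed 255,
// so widen each element to Int before adding.
let total = bytes.reduce(0) { $0 + Int($1) }
print(total)   // 750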

LReal Vs Real Data Types

In PLC Structure Text what is main difference between a LReal Vs a Real data type? Which would you use when replacing a double or float when converting from a C based language to structure text with a PLC
LReal is a double-precision floating-point value stored in 64 bits, whereas Real is a single-precision floating-point value stored in 32 bits. LReal therefore corresponds to a double in a C-based language, and Real to a float. Keep in mind that, depending on the PLC, Real values may be converted to LReal for calculations anyway. An LReal gives you about 15 decimal digits of precision versus about 9 for a Real. So if you need more than 9 decimal digits I would recommend LReal; if you need fewer, I would stick with Real, because to get an LReal you have to convert from an Integer to a Real to an LReal, and Real saves you that step.

More precision than double in swift

Are there are any floating points more accurate than Double available in Swift? I know that in C there is the long double, but I can't seem to find its equivalent in Apple's new programming language.
Any help would be greatly appreciated!
Yes, there is! There is Float80, exactly for that: it stores 80 bits (10 bytes). You can use it like any other floating-point type. Note that there are Float32, Float64, and Float80 in Swift, where Float32 is just a typealias for Float and Float64 is one for Double.
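A minimal sketch (note that Float80 is only available on x86 targets; it does not exist on arm64 devices such as Apple silicon):
let x: Float80 = 1.0 / 3.0
let y: Double = 1.0 / 3.0
print(x)   // about 19 significant digits
print(y)   // about 16 significant digits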
Currently iOS 11+ runs only on 64-bit platforms, and Double holds the highest precision of the standard types.
Double has a precision of at least 15 decimal digits, whereas the precision of Float can be as little as 6 decimal digits. The appropriate floating-point type to use depends on the nature and range of values you need to work with in your code. In situations where either type would be appropriate, Double is preferred.
CGFloat, however, wraps a native type that is Float on 32-bit architectures and Double on 64-bit architectures.
https://developer.apple.com/library/content/documentation/Swift/Conceptual/Swift_Programming_Language/TheBasics.html
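You can check this yourself (a minimal sketch):
import CoreGraphics

print(CGFloat.NativeType.self)     // prints Double on a 64-bit platform
print(MemoryLayout<CGFloat>.size)  // 8 (bytes) on 64-bit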

Integer to Binary Conversion in Simulink

I know it's a very basic question, but I am still struggling to convert binary to integer and vice versa in Simulink.
I could use a function block and the built-in MATLAB functions to do it, but I intend to use Simulink blocks to convert binary to a decimal number.
Please suggest how to do it; any pointers on the internet would be helpful.
You can use a Data Type Conversion block to convert back and forth between the binary (i.e. boolean) type and the various integer (int8, uint8, int16, etc.) or floating-point (single or double) types.
I think this is what you're looking for:
How do I visualize the fixed-point data in binary or hex format using Simulink?