convert number string into float with specific precision (without getting rounding errors) - matlab

I have a vector of cells (say, of size 50x1, called tokens), each of which is a struct with properties x, f1 and f2, which are strings representing numbers. For example, tokens{15} gives:
x: "-1.4343429"
f1: "15.7947111"
f2: "-5.8196158"
and I am trying to put those numbers into 3 vectors (each is also 50x1) whose type is float. So I create 3 vectors:
x = zeros(50,1,'single');
f1 = zeros(50,1,'single');
f2 = zeros(50,1,'single');
and that works fine (why wouldn't it?). But then when I try to populate those vectors: (L is a for loop index)
x(L)=tokens{L}.x;
.. also for the other 2
I get :
The following error occurred converting from string to single:
Conversion to single from string is not possible.
Which I can understand; implicit conversion doesn't work for single. It does work if x, f1 and f2 are of type 50x1 double.
The reason I am doing it with floats is that the data comes from a C program which writes some floats into a file to be read by MATLAB. If I try to convert the values into doubles in the C program I get rounding errors...
So, (after what I hope is a good question,) how might I be able to get the numbers in those strings, at the right precision? (all the strings have the same number of decimal places: 7).
The MCVE:
filedata = fopen('fname1.txt','rt');
%fname1.txt is created by a C program. I am quite sure that the problem isn't there.
scanned = textscan(filedata,'%s','Delimiter','\n');
raw = scanned{1};
stringValues = strings(50,1);
for K=1:length(raw)
stringValues(K)=raw{K};
end
clear K %purely for convenience
regex = 'x=(?<x>[\-\.0-9]*),f1=(?<f1>[\-\.0-9]*),f2=(?<f2>[\-\.0-9]*)';
tokens = regexp(stringValues,regex,'names');
x = zeros(50,1,'single');
f1 = zeros(50,1,'single');
f2 = zeros(50,1,'single');
for L=1:length(tokens)
x(L)=tokens{L}.x;
f1(L)=tokens{L}.f1;
f2(L)=tokens{L}.f2;
end

Use the function str2double before assigning into your arrays (and then cast the result to single if you want). Strings (and char arrays) must be explicitly converted to numbers before they can be used as numbers.
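As a minimal sketch of that suggestion: the tokens cell below is a hypothetical one-element stand-in for the parsed data in the question, holding the values shown for tokens{15}. Char vectors are used here, but str2double also accepts string scalars.

```matlab
% Hypothetical stand-in for the parsed tokens (values taken from tokens{15})
tokens = {struct('x', '-1.4343429', 'f1', '15.7947111', 'f2', '-5.8196158')};

n  = numel(tokens);
x  = zeros(n, 1, 'single');
f1 = zeros(n, 1, 'single');
f2 = zeros(n, 1, 'single');

for L = 1:n
    % Convert text to double first, then cast explicitly to single
    x(L)  = single(str2double(tokens{L}.x));
    f1(L) = single(str2double(tokens{L}.f1));
    f2(L) = single(str2double(tokens{L}.f2));
end
```

Note that single carries only about 7 significant decimal digits, so a value such as 15.7947111 cannot be stored exactly; keeping the result of str2double as a double preserves more of the written digits.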

Related

Matlab hash table with matrix key

I would like to construct a hash table in Matlab, the keys of which are matrices of different sizes, and the values of which are also matrices. The containers.Map class only allows strings as keys. I can certainly just use a cell for the keys, a cell for the value and match the indices of the two cells. Is there a better way to construct the hash table and the associated hash function?
I just played around with containers.Map a little, it seems that you can use char arrays of any length as keys.
>> a = containers.Map;
>> a(repmat('bla',50,500)) = 1;
>> a(repmat('bla',50,500))
ans =
1
You can also convert any numeric array into a char array as follows:
>> x = randn(4)
x =
-0.7371 -0.0799 0.1129 -1.1667
-1.7499 0.8985 0.4400 -1.8543
0.9105 0.1837 0.1017 -1.1407
0.8671 0.2908 2.7873 -1.0933
>> s = char(typecast(x(:),'uint8')')
s =
''uÔ_þ翼qÿû¿/å\¬"í?éúè#¿ë?.YðjÛs´¿Ó¶Ó·PÀì?+Ç? Õ9NÒ?Üéñé¼?
°À9-(Ü?ç¥ìƺ?NsivL#V*aó¨ªò¿{Ò5«ý¿Q8ß:#ò¿í=µU~ñ¿'
Or using the full 16-bit Unicode values allowed by char:
>> s = char(typecast(x(:),'uint16')')
s =
'疺㓦쁁뿛쓆遫뿅䅀庲뿋ꁰ頳劜㿡礋쮼㿘旈帡਑㿨ﮢ电玼㿼譍৊醪㿳랝趚蠷뿴瞶ꆲ쀂伴愹?㿬ꑨ꬞廆뿽㼝ὧ᛻㾱?ﺳ⩝㾢棑罓턽䀁ᕾ統렆뾱'
So putting these together, it is possible to use any array (properly converted to a char array) as key into a hash table:
>> a(s) = 5;
>> a(s)
ans =
5
And, given the numeric array cast to char, it is possible to cast it back to numeric array as well (though the shape of the array will get lost):
x = randn(1,20);
s = char(typecast(x,'uint8'));
y = typecast(uint8(s),'double');
assert(isequal(x,y)) % does not throw an error
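Because the shape is lost in that round trip, one possible workaround is to record the size in the key itself. The sketch below is a variation on the idea above, not the same technique: the helper name mat2key is hypothetical, and the bytes are hex-encoded with sprintf instead of being cast to char directly, so the key contains only printable characters.

```matlab
% Hypothetical helper: key = size prefix + hex dump of the array's bytes
mat2key = @(m) [sprintf('%dx', size(m)), ...
                sprintf('%02x', typecast(double(m(:)'), 'uint8'))];

a  = containers.Map;          % default KeyType is 'char', ValueType is 'any'
m1 = [1 2; 3 4];
m2 = [1 2 3 4];               % same elements, different shape

a(mat2key(m1)) = eye(2);      % matrix key (encoded), matrix value

found    = isKey(a, mat2key(m1));   % the stored key is found
notFound = isKey(a, mat2key(m2));   % a differently shaped array is not
```

The hex encoding makes keys longer than the raw char cast, which is the price of keeping them readable.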
There is another alternative. It is possible to use keys of type different from a string with containers.Map, as stated in the documentation. Keys can be either char arrays, or numeric scalars; they cannot be numeric arrays:
>> a = containers.Map('KeyType','double','ValueType','double');
>> a(5) = 10;
>> a([5,3]) = 5;
Error using containers.Map/subsasgn
Specified key type does not match the type expected for this container.
Thus, you could compute a hash value (as a floating-point double value or 64-bit integer value) from your arrays. How to best do this I don't know, maybe the dot product with a set of random values? At this related question there are some suggestions. There are also some functions on the MATLAB File Exchange that would be helpful (e.g. here and here).

passing varargin to subfunction if string

I want to modify function rand and define my own function
function num = rand(varargin)
Most of the time, I just wrap the invocation
num = builtin("rand", [varargin{:}]);
and this works well except in case there is a string argument.
For rand(2,3,"double") I obtain
warning: implicit conversion from numeric to char
warning: called from rand at line 83 column 11
error: rand: unrecognized string argument
error: called from rand at line 83 column 11
and for rand("seed",2) the same.
On the other hand, rand("seed") seems to work fine.
Can anyone offer an explanation and a solution?
The syntax:
num = builtin('rand', [varargin{:}]);
Will only work for you in cases where the input arguments can be represented as either a comma-separated list or a vector, such as when you specify a size for rand:
num = rand(2, 3, 4);
% Or ...
num = rand([2 3 4]);
It will not work for inputs that must be entered separately, like so:
num = rand(2, 3, 'double'); % Works
num = rand([2 3 'double']); % Throws an error
In general, you should just pass the contents of varargin as a comma-separated list (without collecting the contents into a vector/matrix) since builtin is designed to handle that just fine:
num = builtin('rand', varargin{:});
Also, be mindful of the difference between "strings" like 'rand' (a character array) and "rand" (a string). They can have different behavior in certain cases.
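To illustrate the difference between the two call styles, here is a small sketch. The names args and my_rand are hypothetical; an anonymous function stands in for the wrapper so that the example does not shadow rand.

```matlab
% Example arguments, as they would arrive in varargin for rand(2, 3, 'double')
args = {2, 3, 'double'};

% Concatenation mixes numbers and text, so the numbers are converted to
% characters and the result is a single 1x8 char vector.
% (Octave additionally warns about the implicit conversion here.)
merged = [args{:}];

% Expanding the cell array passes three separate arguments instead
my_rand = @(varargin) builtin('rand', varargin{:});   % hypothetical wrapper
num = my_rand(args{:});                               % 2x3 matrix of doubles
```

The concatenated form hands rand one char vector that it cannot interpret, which matches the "unrecognized string argument" error in the question.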

Scientific notation in MATLAB

Say I have an array that contains the following elements:
1.0e+14 *
1.3325 1.6485 2.0402 1.0485 1.2027 2.0615 1.7432 1.9709 1.4807 0.9012
Now, is there a way to grab 1.0e+14 * (base and exponent) individually?
If I do arr(10), then this will return 9.0120e+13 instead of 0.9012e+14.
Suppose the goal is to grab any elements in the array whose coefficient is less than one. Is there a way to obtain 1.0e+14, so that I could just do arr(i) < 1.0e+14?
I assume you want string output.
Let a denote the input numeric array. You can do it this way, if you don't mind using evalc (a variant of eval, which is considered bad practice):
s = evalc('disp(a)');
s = regexp(s, '[\de+-\.]+', 'match');
This produces a cell array with the desired strings.
Example:
>> a = [1.2e-5 3.4e-6]
a =
1.0e-04 *
0.1200 0.0340
>> s = evalc('disp(a)');
>> s = regexp(s, '[\de+-\.]+', 'match')
s =
'1.0e-04' '0.1200' '0.0340'
Here is the original answer from Alain.
Basic math can tell you that:
floor(log10(N))
The log base 10 of a number tells you approximately how many digits before the decimal are in that number.
For instance, 99987123459823754 is 9.998E+016
log10(99987123459823754) is 16.9999441, the floor of which is 16 - which can basically tell you "the exponent in scientific notation is 16, very close to being 17".
Now you have the exponent of the scientific notation. This should allow you to get to whatever your goal is ;-).
And depending on what you want to do with your exponent and the number, you could also define your own method. An example is described in this thread.
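As a sketch of how floor(log10(N)) could be applied to the array from the question (only three of its values are used here), assuming the displayed common factor corresponds to the largest exponent in the array:

```matlab
arr = [1.3325 1.6485 0.9012] * 1e14;

e = floor(log10(abs(arr)));   % per-element exponent
m = arr ./ 10.^e;             % per-element coefficient, in [1, 10)

scale = 10^max(e);            % assumed common factor
small = arr < scale;          % elements whose displayed coefficient is below one
```

Here the last element has exponent 13 rather than 14, so it is the only one flagged in small.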

format a number as I want

I have the number a = 3.860575156847749e+003; and I would like to display it in plain decimal notation. So I write b = sprintf('%0.1f', a);. If I print b I get 3860.6, which is perfect. However, while a is of type double, b has been converted to char.
What can I do to properly format that number and still have a number as the final result?
Best regards
Well, you have to distinguish between both the numerical value (the number stored in your computer's memory) and its decimal representation (the string/char array you see on your screen). You can't really impose a format on a number: a number has a value which can be represented as a string in different ways (e.g. 1234 = 1.234e3 = 12.34e2 = 0.1234e4 = ...).
If you want to store a number with less precision, you can use round, floor, ceil to calculate a number which has less precision than the original number.
E.g. if you have a = 3.860575156847749e+003 and you want a number that only has 5 significant digits, you can do so by using round:
a = 3.860575156847749e+003;
p = 0.1; % absolute precision you want
b = p .* round(a./p)
This yields a variable b = 3.8606e3, which can be represented in different ways but should contain only zeros after the fifth digit (in practice, very small residual values are sometimes unavoidable). I think that is what you actually want, but remember that for a computer this number is equal to 3.86060000e3 as well (it is just another string representation of the same value). So I want to stress again that the decimal representation is not set by rounding the number but by (implicitly) calling a function that converts the double to a string, such as sprintf or disp.
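A small sketch of that distinction, using the value from the question: rounding changes the stored number, while sprintf only produces text from it.

```matlab
a = 3.860575156847749e+003;
p = 0.1;                      % absolute precision wanted

b = p .* round(a ./ p);       % still a double, now close to 3860.6
s = sprintf('%0.1f', b);      % a char vector, used only for display

bIsNumber = isa(b, 'double');
sIsText   = ischar(s);
```

Keep b for further arithmetic and use s only where text is needed, for example in a plot label or a message.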
The result of sprintf is a text variable. Have you tried declaring a variable as an integer (for example) and using it as the return value of the sprintf call?
This can be useful to you: http://blogs.mathworks.com/loren/2006/12/27/displaying-numbers-in-matlab/

What are # and : used for in Qbasic?

I have legacy code that does math calculations. It is reportedly written in QBasic, and it runs under VB6 successfully. I plan to rewrite the code in a newer language/platform, for which I must first work backwards and derive a detailed algorithm from the existing code.
The problem is that I can't understand the syntax of a few lines:
Dim a(1 to 200) as Double
Dim b as Double
Dim f(1 to 200) as Double
Dim g(1 to 200) as Double
For i = 1 to N
a(i) = b: a(i+N) = c
f(i) = 1#: g(i) = 0#
f(i+N) = 0#: g(i+N) = 1#
Next i
Based on my work with VB5 like 9 years ago, I am guessing that a, f and g are Double arrays indexed from 1 to 200. However, I am completely lost about this use of # and : together inside the body of the for-loop.
: is the statement separator; it allows you to chain multiple statements on the same line. a(i) = b: a(i+N) = c is equivalent to:
a(i)=b
a(i+N)=c
# is a type specifier. It specifies that the number it follows should be treated as a double.
I haven't programmed in QBasic for a while, but I did so extensively in high school. The # symbol indicates a particular data type: it designates the RHS value as a double-precision floating-point number (similar to writing 1.0f in C to make 1.0 a single-precision float). The colon is similar to the semicolon in C, in that it delimits separate statements. For instance:
a(i) = b: a(i+N) = c
is, in C:
a[i] = b; a[i+N] = c;