I am trying to sum very large numbers in MATLAB, such as e^800 and e^1000 and obtain an answer.
I know that in Double-Precision, the largest number I can represent is 1.8 * 10^308, otherwise I get Inf, which I am getting when trying to sum these numbers.
My question is, how do I go about estimating an answer for sums of very, very large numbers like these without using vpa, or some other toolbox?
Should I use strings? It is possible to do this using logs? Can I represent the floats as m x 2^E and if so, how do I take a number such as e^700 and convert it to that? If the number is larger than the threshold for Inf, should I divide it by two, and store it in two different variables?
For example, how would I obtain an approximate answer for:
e^700 + e^800 + e^900 + e^1000 ?
A possible approximation is to use the rounded values of these numbers (I personally used Wolfram|Alpha), then perform "long addition" as they teach in elementary school:
function sumStr = q57847408()
% store rounded values as string:
e700r = "10142320547350045094553295952312676152046795722430733487805362812493517025075236830454816031618297136953899163768858065865979600395888785678282243008887402599998988678389656623693619501668117889366505232839133350791146179734135738674857067797623379884901489612849999201100199130430066930357357609994944589";
e800r = "272637457211256656736477954636726975796659226578982795071066647118106329569950664167039352195586786006860427256761029240367497446044798868927677691427770056726553709171916768600252121000026950958713667265709829230666049302755903290190813628112360876270335261689183230096592218807453604259932239625718007773351636778976141601237086887204646030033802";
e900r = "7328814222307421705188664731793809962200803372470257400807463551580529988383143818044446210332341895120636693403927733397752413275206079839254190792861282973356634441244426690921723184222561912289431824879574706220963893719030715472100992004193705579194389741613195142957118770070062108395593116134031340597082860041712861324644992840377291211724061562384383156190256314590053986874606962229";
e1000r = "197007111401704699388887935224332312531693798532384578995280299138506385078244119347497807656302688993096381798752022693598298173054461289923262783660152825232320535169584566756192271567602788071422466826314006855168508653497941660316045367817938092905299728580132869945856470286534375900456564355589156220422320260518826112288638358372248724725214506150418881937494100871264232248436315760560377439930623959705844189509050047074217568";
% pad to the same length with zeros on the left:
padded = pad([e700r; e800r; e900r; e1000r], 'left', '0');
% convert the padded value to an array of digits:
dig = uint8(char(padded) - '0');
% some helpful computations for later:
colSum = [0 uint8(sum(dig, 1))]; % extra 0 is to prevent overflow of MSB
remainder = mod(colSum, 10);
carry = idivide(colSum, 10, 'floor');
while any(carry) % can also be a 'for' loop with nDigit iterations (at most)
result = remainder + circshift(carry, -1);
remainder = mod(result, 10);
carry = idivide(result, 10, 'floor');
end
% remove leading zero (at most one):
if ~result(1)
result = result(2:end);
end
% convert result back to string:
sumStr = string(char(result + '0'));
This gives the (rounded) result of:
197007111401704699388887935224332312531693805861198801302702004327171116872054081548301452764017301057216669857236647803717912876737392925607579016038517631441936559738211677036898431095605804172455718237264052427496060405708350697523284591075347592055157466708515626775854212347372496361426842057599220506613838622595904885345364347680768544809390466197511254544019946918140384750254735105245290662192955421993462796807599177706158188
Typos fixed from before.
Decimal Approximation:
function [m, new_exponent] = base10_mantissa_exponent(base, exponent)
exact_exp = exponent*log10(abs(base));
new_exponent = floor(exact_exp);
m = power(10, exact_exp - new_exponent);
end
So the value e600 would become 3.7731 * 10260.
And the value 117150 would become 1.6899 * 10310.
To add these two values together, I took the difference between the two exponents and divided the mantissa of the smaller term by it. Then it's just as simple as adding the mantissas together.
mantissaA = 3.7731;
exponentA = 260;
mantissaB = 1.6899;
exponentB = 310;
diff = abs(exponentA - exponentB);
if exponentA < exponentB
mantissaA = mantissaA / (10^diff);
finalExponent = exponentB;
elseif exponentB < exponentA
mantissaB = mantissaB / (10^diff);
finalExponent = exponentA;
end
finalMantissa = mantissaA + mantissaB;
This was important for me as I was performing sums such as:
(Σ ex) / (Σ xex)
From x=1 to x=1000.
Related
It seems tricky to use Matlab's fixed point quantizer, probably some hidden parameters. Here is an example:
b = fi(pi, 1, 21, 16) % use total length = 21-bit, fraction = 16-bit
ntBP = numerictype(1, 10, 9) % quantize to total length = 10-bit, fraction = 9-bit
c = quantize(b, ntBP) % this returns -0.8594
Looks like the problem is matlab didn't take care of the sign bit:
bin(b) % return '000110010010000111111'
bin(c) % return '1001001000'. This is why the quantizer output is negative
The question is, is this the expected way to use the quantizer or not?
Thanks
I am using MATLAB R2020a on a MacOS. I am trying to remove outlier values in a while loop. This involves calculating an exponentially weighted moving mean and then comparing this a vector value. If the conditions are met, the vector input is then added to a separate vector of 'acceptable' values. The while loop then advances to the next input and calculates the new exponentially weighted moving average which includes the newly accepted vector input.
However, if the condition is not met, I written code so that, instead of adding the input sample, a zero is added to the vector of 'acceptable' values. Upon the next acceptable value being added, I currently have it so the zero immediately before is replaced by the mean of the 2 framing acceptable values. However, this only accounts for one past zero and not for multiple outliers. Replacing with a framing mean may also introduce aliaising errors.
Is there any way that the zeros can instead be replaced by linearly interpolating the "candidate outlier" point using the gradient based on the framing 2 accepted vector input values? That is, is there a way of counting backwards within the while loop to search for and replace zeros as soon as a new 'acceptable' value is found?
I would very much appreciate any suggestions, thanks in advance.
%Calculate exponentially weighted moving mean and tau without outliers
accepted_means = zeros(length(cycle_periods_filtered),1); % array for accepted exponentially weighted means
accepted_means(1) = cycle_periods_filtered(1);
k = zeros(length(cycle_periods_filtered),1); % array for accepted raw cycle periods
m = zeros(length(cycle_periods_filtered), 1); % array for raw periods for all cycles with outliers replaced by mean of framing values
k(1) = cycle_periods_filtered(1);
m(1) = cycle_periods_filtered(1);
tau = m/3; % pre-allocation for efficiency
i = 2; % index for counting through input signal
j = 2; % index for counting through accepted exponential mean values
n = 2; % index for counting through raw periods of all cycles
cycle_index3(1) = 1;
while i <= length(cycle_periods_filtered)
mavCurrent = (1 - 1/w(j))*accepted_means(j - 1) + (1/w(j))*cycle_periods_filtered(i);
if cycle_periods_filtered(i) < 1.5*(accepted_means(j - 1)) && cycle_periods_filtered(i) > 0.5*(accepted_means(j - 1)) % Identify high and low outliers
accepted_means(j) = mavCurrent;
k(j) = cycle_periods_filtered(i);
m(n) = cycle_periods_filtered(i);
cycle_index3(n) = i;
tau(n) = m(n)/3;
if m(n - 1) == 0
m(n - 1) = (k(j) + k(j - 1))/2;
tau(n - 1) = m(n)/3;
end
j = j + 1;
n = n + 1;
else
m(n) = 0;
n = n + 1;
end
i = i + 1;
end
% Scrap the tail
accepted_means(j - 1:end)=[];
k(j - 1:end) = [];
Is there a quick and easy way to truncate a decimal number, say beyond 4 digits, in MATLAB?
round() isn't helping, it's still rounding off. I have to use it in for loop, so the quickest way is appreciated.
Thanks for your inputs.
Here's one method to truncate d digits after the decimal.
val = 1.234567;
d = 4;
val_trunc = fix(val*10^d)/10^d
Result
val_trunc =
1.2345
If you know that val is positive then floor() will work in place of fix().
Yet another option:
x = -3.141592653;
x_trun = x - rem(x,0.0001)
x_trun =
-3.1415
Kudos to gnovice for the update.
In general, for n decimal places:
x_trun = x - rem(x,10^-n)
Truncating is like rounding if you subtract 5 from the decimal after the last you want to keep.
So, to truncate x to n decimal figures use
round(x - sign(x)*.5/10^n, n)
(Thanks to #gnovice for noticing the need of sign(x) in order to deal with negative numbers.)
For example,
format long
x = 3.141592653589793;
for n = 2:5
result = round(x - sign(x)*.5/10^n, n);
disp(result)
end
gives
3.140000000000000
3.141000000000000
3.141500000000000
3.141590000000000
As you asked for the fastest method, I've put together a quick benchmark of the top 3 truncation methods currently answered here. Please see the code below. I increased the size of the x vector to be rounded, using the timeit function for timing.
function benchie()
% Set up iteration variables
K = 17; n = 4; T = zeros(K,3);
for k = 1:K
x = rand(2^k,1);
% Define the three truncation functions
LuisRound = #() round(x - 0.5/10^n, n);
JodagFix = #() fix(x*10^n)/10^n;
InfoRem = #() x - rem(x,10^-n);
% Time each function
T(k,1) = timeit(LuisRound);
T(k,2) = timeit(JodagFix);
T(k,3) = timeit(InfoRem);
end
% Plot results
figure
plot(2.^(1:K), T); legend('LuisRound', 'JodagFix', 'InfoRem');
grid on; xlabel('number of elements in x'); ylabel('time taken');
end
The resulting plot can be seen here:
According to this test, the fix method suggested by jodag is significantly quicker, so you should use something like this for a custom truncation function to n decimal places:
function y = trunc(x, n)
%% Truncate matrix/scalar x to n decimal places
if nargin < 2; n = 0; end; % default to standard fix behaviour if no n given
y = fix(x*10^n)/10^n; % return value truncated to n decimal places
end
tests:
>> trunc([pi, 10.45, 1.9], 4)
>> ans = [3.1415 10.4500 1.9000]
>> trunc([pi, 10.45, 1.9], 1)
>> ans = [3.1 10.4 1.9]
Use my round2 function:
function result = round2(x,accuracy)
if nargin<2, accuracy=1; end %default accuracy 1 same as 'round'
if accuracy<=0, accuracy=1e-6; end
result=round(x/accuracy)*accuracy;
end
Usual use: round2(3.14159,0.01) But you can also use it to round to every other multipler, for example: round2(number,2) will round to even number, or round2(number,10) etc..
1 + 1/(2^4) + 1/(3^4) + 1/(4^4) + ...
This is the infinite series that I'd like to get the sum value. So I wrote this code in MATLAB.
n = 1;
numToAdd = 1;
sum = 0;
while numToAdd > 0
numToAdd = n^(-4);
sum = sum + numToAdd;
n = n + 1;
end
disp(sum);
But I couldn't get the result because this code occurred an infinite loop. However, the code I write underneath -- it worked well. It took only a second.
n = 1;
oldsum = -1;
newsum = 0;
while newsum > oldsum
oldsum = newsum;
newsum = newsum + n^(-4);
n = n+1;
end
disp(newsum);
I read these codes again and googled for a while, but coudln't find out the critical point. What makes the difference between these two codes? Is it a matter of precision of double in MATLAB?
The first version would have to go down to the minimum value for a double ~10^-308, while the second will only need to go down to the machine epsilon ~10^-16. The epsilon value is the largest value x such that 1+x = 1.
This means the first version will need approximately 10^77 iterations, while the second only needs 10^4.
The problem boils down to this:
x = 1.23456789; % Some random number
xEqualsXPlusEps = (x == x + 1e-20)
ZeroEqualsEps = (0 == 1e-20)
xEqualsXPlusEps will be true, while ZeroEqualsEps is false. This is due to the way floating point arithmetic works. The value 1e-20 is smaller than the least significant bit of x, so x+1e-20 won't be larger than x. However 1e-20 is not considered equal to 0. In comparison to x, 1e-20 is relatively small, whereas in comparison to 0, 1e-20 is not small at all.
To fix this problem you would have to use:
while numToAdd > tolerance %// Instead of > 0
where tolerance is some small number greater than zero.
I want to remove duplicate entries from a vector on Matlab. The problem I'm having is that rounding errors are stopping the inbuilt Matlab function 'unique' from working properly. Ideally I'd like a way to set some sort of tolerance on the 'unique' function, or a small procedure that will remove the duplicates otherwise. If both the real and imaginary parts of two entries differ by less than 0.0001, then I'm happy to consider them equal. How can I do this?
Any help will be greatly appreciated. Thanks
A simple approximation would be to round the numbers and the use the indices returned by unique:
X = ... (input vector)
[b, i] = unique(round(X / (tolerance * (1 + i))));
output = X(i);
(you can probably replace b with ~ depending on your Matlab version).
it won't quite have the behaviour you want, since it is possible that two numbers are very close but will be rounded differently. I think you could mitigate this by doing:
X = ... (input vector)
[b, ind] = unique(round(X / (tolerance * (1 + i))));
X = X(ind);
[b, ind] = unique(round(X / (tolerance * (1 + i)) + 0.5 * (1 + i)));
X = X(ind);
This will round them twice, so any numbers that are exactly on a rounding boundary will be caught by the second unique.
There is still some messiness in this - some numbers will be affected as though the tolerance was doubled. But it might be sufficient for your needs.
The alternative is probably a for loop:
X = sort(X);
last = X(1);
indices = ones(numel(X), 1);
for j=2:numel(X)
if X(j) > last + tolerance * (1 + i)
last = X(j) + tolerance * (1 + i) / 2;
else
indices(j) = 0;
end
end
X = X(logical(indices));
I think this has the best behaviour you can expect (because you want to represent the vector by as few unique values as possible - when there are lots of numbers that differ by less than the tolerance level, there may be multiple ways of splitting them. This algorithm does so greedily, starting with the smallest).
I'm almost certain the below ill always assume any values closer than 1e-8 are equal. Simply replace 1e-8 with whatever value you want.
% unique function that assumes 1e-8 is equal
function [out, I] = unique(input, first_last)
threshold = 1e-8;
if nargin < 2
first_last = 'last';
end
[out, I] = sort(input);
db = diff(out);
k = find(abs(db) < threshold);
if strcmpi(first_last, 'last')
k2 = min(I(k), I(k+1));
elseif strcmpi(first_last, 'first')
k2 = max(I(k), I(k+1));
else
error('unknown flag option for unique, must be first or last');
end
k3 = true(1, length(input));
k3(k2) = false;
out = out(k3(I));
I = I(k3(I));
end
The following might serve your purposes. Given X, an array of complex doubles, it sorts them, then checks whether the absolute value differences between elements is within the complex tolerance, real_tol and imag_tol. It removes elements that satisfy this tolerance.
function X_unique = unique_complex_with_tolerance(X,real_tol,imag_tol)
X_sorted = sort(X); %Sorts by magnitude first, then imaginary part.
dX_sorted = diff(X_sorted);
dX_sorted_real = real(dX_sorted);
dX_sorted_imag = imag(dX_sorted);
remove_idx = (abs(dX_sorted_real)<real_tol) & (abs(dX_sorted_imag)<imag_tol);
X_unique = X_sorted;
X_unique(remove_idx) = [];
return
Note that this code will remove all elements which satisfy this difference tolerance. For example, if X = [1+i,2+2i,3+3i,4+4i], real_tol = 1.1, imag_tol = 1.1, then this function will return only one element, X_unique = [4+4i], even though you might consider, for example, X_unique = [1+i,4+4i] to also be a valid answer.