How can I set the level of precision for Raku's sqrt? - perl

With Perl, one could use bignum to set the level of precision for all operators. As in:
use bignum ( p => -50 );
print sqrt(20); # 4.47213595499957939281834733746255247088123671922305
With Raku I have no problems with rationals, since I can use Rat / FatRat, but I don't know how to get a higher level of precision for sqrt:
say 20.sqrt # 4.47213595499958

As explained in Elizabeth's answer, sqrt returns a Num type, so it has limited precision; see her answer for the details.
For that reason I created a Raku class, BigRoot, which uses Newton's method and FatRat types to calculate the roots. You may use it like this:
use BigRoot;
# Can change precision level (Default precision is 30)
BigRoot.precision = 50;
my $root2 = BigRoot.newton's-sqrt: 2;
# 1.41421356237309504880168872420969807856967187537695
say $root2.WHAT;
# (FatRat)
# Can use other root numbers
say BigRoot.newton's-root: root => 3, number => 30;
# 3.10723250595385886687766242752238636285490682906742
# Numbers can be Int, Rational and Num:
say BigRoot.newton's-sqrt: 2.123;
# 1.45705181788431944566113502812562734420538186940001
# Can use other rational roots
say BigRoot.newton's-root: root => FatRat.new(2, 3), number => 30;
# 164.31676725154983403709093484024064018582340849939498
# Results are rounded:
BigRoot.precision = 8;
say BigRoot.newton's-sqrt: 2;
# 1.41421356
BigRoot.precision = 7;
say BigRoot.newton's-sqrt: 2;
# 1.4142136
In general it seems to be pretty fast (at least compared to Perl's Math::BigFloat).
Benchmarks (times in seconds):
| sqrt with 10_000 digits of precision  | Raku  | Perl  |
|---------------------------------------|-------|-------|
| 20000000000                           | 0.714 | 3.713 |
| 200000.1234                           | 1.078 | 4.269 |
| π                                     | 0.879 | 3.677 |
| 123.9/12.29                           | 0.871 | 9.667 |
| 999999999999999999999999999999999     | 1.208 | 3.937 |
| 302187301.3727 / 123.30219380928137   | 1.528 | 7.587 |
| 2 + 999999999999 ** 10                | 2.193 | 3.616 |
| 91200937373737999999997301.3727 / π   | 1.076 | 7.419 |
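For reference, the Perl timings presumably come from Math::BigFloat, which is what bignum uses under the hood. The first row's Perl computation would look roughly like this (a minimal sketch; the original benchmark harness isn't shown):

use Math::BigFloat;

# sqrt of 20000000000 to 10_000 digits of accuracy
my $x = Math::BigFloat->new('20000000000');
print $x->bsqrt(10_000), "\n";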
If you want to implement your own sqrt using Newton's method, this is the basic idea behind it:

sub newtons-sqrt(:$number, :$precision) returns FatRat {
    # Stop once successive guesses differ by less than 10 ** -($precision + 1)
    my FatRat $error = FatRat.new: 1, 10 ** ($precision + 1);

    # Native sqrt gives a cheap (~15-digit) starting point
    my FatRat $guess = (sqrt $number).FatRat;
    my FatRat $input = $number.FatRat;
    my FatRat $diff  = $input;

    while $diff > $error {
        # Newton's iteration for f(x) = x² - n:  x' = x - (x² - n) / 2x
        my FatRat $new-guess = $guess - (($guess ** 2 - $input) / (2 * $guess));
        $diff  = abs($new-guess - $guess);
        $guess = $new-guess;
    }

    return $guess.round: FatRat.new: 1, 10 ** $precision;
}
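For comparison, here is the same idea sketched in Perl with Math::BigFloat and its overloaded operators (the sub name is mine, and this is not the BigRoot implementation):

use strict;
use warnings;
use Math::BigFloat;

sub newtons_sqrt {
    my ($number, $precision) = @_;
    Math::BigFloat->div_scale($precision + 10);      # working digits for '/'
    my $x     = Math::BigFloat->new($number);
    my $guess = Math::BigFloat->new(sqrt $number);   # native sqrt as a starting point
    my $error = Math::BigFloat->new(10) ** -($precision + 1);

    while (1) {
        # Newton's iteration: x' = x - (x**2 - n) / 2x
        my $new  = $guess - ($guess * $guess - $x) / (2 * $guess);
        my $diff = ($new - $guess)->babs;
        $guess = $new;
        last if $diff < $error;
    }
    return $guess->bfround(-$precision);             # round to $precision decimals
}

print newtons_sqrt(2, 50), "\n";
# 1.41421356237309504880168872420969807856967187537695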

In Rakudo, sqrt is implemented using the sqrt_n NQP opcode. The _n suffix indicates that it only supports native nums, which implies limited precision.
Internally, I'm pretty sure this just maps to the sqrt functionality of one of the underlying math libraries that MoarVM uses.
I guess what we need is an ecosystem module that exports a sqrt function based on Rational arithmetic. That would give you the option of a higher-precision sqrt implementation at the expense of performance, which in turn might prove interesting enough to integrate into core.

Related

Expression for setting lowest n bits that works even when n equals word size

NB: the purpose of this question is to understand Perl's bitwise operators better. I know of ways to compute the number U described below.
Let $i be a nonnegative integer. I'm looking for a simple expression E<$i>¹ that will evaluate to the unsigned int U whose $i lowest bits are all 1's, and whose remaining bits are all 0's. E.g. E<8> should be 255. In particular, if $i equals the machine's word size (W), E<$i> should equal ~0².
The expressions (1 << $i) - 1 and ~(~0 << $i) both do the right thing, except when $i equals W, in which case they both take on the value 0, rather than ~0.
I'm looking for a way to do this that does not require computing W first.
EDIT: OK, I thought of an ugly, plodding solution:
$i < 1 ? 0 : do { my $j = 1 << $i - 1; $j < $j << 1 ? ( $j << 1 ) - 1 : ~0 }
or
$i < 1 ? 0 : ( 1 << ( $i - 1 ) ) < ( 1 << $i ) ? ( 1 << $i ) - 1 : ~0
(Also impractical, of course.)
¹ I'm using the strange notation E<$i> as shorthand for "expression based on $i".
² I don't have a strong preference at the moment for what E<$i> should evaluate to when $i is strictly greater than W.
On systems where eval($Config{nv_overflows_integers_at}) >= 2**($Config{ptrsize}*8) (which excludes systems that use double-precision floats and 64-bit ints),
2**$i - 1
On all systems,
( int(2**$i) - 1 )|0
When i<W, int converts the NV into an IV/UV, allowing the subtraction to work on systems where the precision of NVs is less than the size of UVs. |0 has no effect in this case.
When i≥W, int has no effect, so the subtraction has no effect either. |0 therefore overflows, in which case Perl returns the largest integer.
I don't know how reliable that |0 behaviour is. It could be compiler-specific. Don't use this!
use Config qw( %Config );
$i >= $Config{uvsize}*8 ? ~0 : ~(~0 << $i)
Technically, the word size is looked up, not computed.
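For example, wrapped as a reusable function (a sketch; lowest_bits_mask is a name I made up):

use strict;
use warnings;
use Config qw( %Config );

# Mask with the $i lowest bits set; falls back to ~0 once $i reaches the word size.
sub lowest_bits_mask {
    my ($i) = @_;
    return $i >= $Config{uvsize} * 8 ? ~0 : ~(~0 << $i);
}

printf "%2d -> %b\n", $_, lowest_bits_mask($_) for 1, 8, 16;
printf "%2d -> %x\n", 64, lowest_bits_mask(64);   # ffffffffffffffff on a 64-bit perl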
Fun challenge!
use Devel::Peek qw[Dump];
for my $n (8, 16, 32, 64) {
    Dump(~(((1 << ($n - 1)) << 1) - 1) ^ ~0);
}
Output:
SV = IV(0x7ff60b835508) at 0x7ff60b835518
  REFCNT = 1
  FLAGS = (PADTMP,IOK,pIOK)
  IV = 255
SV = IV(0x7ff60b835508) at 0x7ff60b835518
  REFCNT = 1
  FLAGS = (PADTMP,IOK,pIOK)
  IV = 65535
SV = IV(0x7ff60b835508) at 0x7ff60b835518
  REFCNT = 1
  FLAGS = (PADTMP,IOK,pIOK)
  IV = 4294967295
SV = IV(0x7ff60b835508) at 0x7ff60b835518
  REFCNT = 1
  FLAGS = (PADTMP,IOK,pIOK,IsUV)
  UV = 18446744073709551615
Perl compiled with:
ivtype='long', ivsize=8, nvtype='double', nvsize=8
The documentation on the shift operators in perlop has an answer to your problem: use bigint;.
From the documentation:
Note that both << and >> in Perl are implemented directly using << and >> in C. If use integer (see Integer Arithmetic) is in force then signed C integers are used, else unsigned C integers are used. Either way, the implementation isn't going to generate results larger than the size of the integer type Perl was built with (32 bits or 64 bits).
The result of overflowing the range of the integers is undefined because it is undefined also in C. In other words, using 32-bit integers, 1 << 32 is undefined. Shifting by a negative number of bits is also undefined.
If you get tired of being subject to your platform's native integers, the use bigint pragma neatly sidesteps the issue altogether:
print 20 << 20;  # 20971520
print 20 << 40;  # 5120 on 32-bit machines,
                 # 21990232555520 on 64-bit machines

use bigint;
print 20 << 100; # 25353012004564588029934064107520
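Applied to this question, bigint also makes the straightforward mask expression safe for any $i, at the cost of producing Math::BigInt objects rather than native unsigned integers (a sketch):

use bigint;

my $i = 64;
print( (1 << $i) - 1, "\n" );    # 18446744073709551615, i.e. 2**64 - 1
print( (1 << 100) - 1, "\n" );   # 1267650600228229401496703205375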

Converting ASCII decimal

I receive two equivalent strings from my database depending on whether I ask for it in binary or text format.
Binary is hexadecimal... 4d4d002a0000100801010101010101...(134916 characters)
Text is (I think ASCII decimal)... //x3464346430303261303030... (269832 characters)
I can convert the hexadecimal version into a byte array and ultimately an NSData (67458 bytes):
let data = NSMutableData(capacity: self.characters.count / 2)
for var index = self.startIndex; index < self.endIndex; index = index.advancedBy(2) {
    let byteString = self.substringWithRange(Range<String.Index>(start: index, end: index.advancedBy(2)))
    let byteUInt = UInt8(strtoul(byteString, nil, 16))
    data?.appendBytes([UInt8]([byteUInt]), length: 1)
}
But I am having no such luck with the text version. Tried parsing it a million different ways and I can't come up with an equivalent conversion.
If it matters, the database is PostgreSQL v9.5 and the data in text format is returned as a null-terminated character string (char *).
Any insight would be greatly appreciated.
It appears that the "ASCII representation" is a hex encoding of the hex encoding, so you should be able to produce a proper result by applying the same conversion twice, as the table and the sketch below show:
34 | 64 | 34 | 64 | 30 | 30 | 32 | 61 | 30 | 30 | 30 | -- Original
4 | d | 4 | d | 0 | 0 | 2 | a | 0 | 0 | 0 | -- ASCII conversion
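A quick way to check the idea (a sketch in Perl, where pack's hex decoding makes the double conversion a one-liner; the sample string is truncated from the question):

# Decode the hex encoding twice: text -> hex string -> raw bytes.
my $text = '34643464303032613030303031303038';   # hex of "4d4d002a00001008"
my $hex  = pack 'H*', $text;                      # "4d4d002a00001008"
my $raw  = pack 'H*', $hex;                       # bytes 4D 4D 00 2A 00 00 10 08
printf "%s (%d bytes)\n", $hex, length $raw;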

PowerShell formatting numbers by variables

I have integer values: 3 60 150 1500 and float values 1.23354, 1.234, 1.234567...
I calculate the number of digits of the biggest integer:
$nInt = [System.Math]::Ceiling([math]::log10($maxInt))
# nInt = 4
and, separately, the largest number of decimals after the decimal point of the float variables: $nDec = 6.
How can I format the output so that all integers have the same string length, padded with leading spaces?
|1500
| 150
| 60
| 3
And all floats with the same string length as well?
1.234567|
1.23354 |
1.234 |
The | is just to mark my 'point of measure'.
Of course I have to choose a font in which all characters have the same pixel width.
I am thinking of formatting with "{0:n}" or $int.ToString(""), but I can't see how to use this.
Try PadLeft or PadRight. For example, for your integers:
$int.ToString().PadLeft($nInt, ' ')   # e.g. "   3", "  60", "1500"

Perl CPAN module fisher exact test

Is there any module on CPAN that provides a method to compute Fisher's exact test?
Example in R, on a 2x2 contingency table like:
17 12
8842559 10003821
fisher.test(matrix(data = c(17,8842559,12,10003821), nrow = 2))
Fisher's Exact Test for Count Data
data: counts
p-value = 0.2642
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
0.7213591 3.6778630
sample estimates:
odds ratio
1.602697
I used the Text::NSP::Measures::2D::Fisher module, but I am not sure it does the same as the above.
use Text::NSP::Measures::2D::Fisher::twotailed;

my $npp = 10003821;
my $n1p = 8842559;
my $np1 = 12;
my $n11 = 17;

my $twotailed_value = calculateStatistic(
    n11 => $n11,
    n1p => $n1p,
    np1 => $np1,
    npp => $npp,
);

if ( (my $errorCode = getErrorCode()) ) {
    print STDERR $errorCode, " - ", getErrorMessage();
}
else {
    print getStatisticName(), " value for bigram is ", $twotailed_value, "\n";
}
but it does not give me anything
The matrix differs between R and Perl; you can't use the same values!
For Perl, this is the matrix (the values you pass are in brackets):

          word2    ~word2
word1     [n11]     n12   | [n1p]
~word1     n21      n22   |  n2p
          -----------------------
          [np1]     np2     [npp]

For R, this is the matrix (the values you pass are in brackets):

          word2    ~word2
word1     [n11]    [n12]  |  n1p
~word1    [n21]    [n22]  |  n2p
          -----------------------
           np1      np2      npp
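So for the R example, the marginal totals that Text::NSP expects would be derived from the four cells first (a sketch following the layout above):

use Text::NSP::Measures::2D::Fisher::twotailed;

# Cells as laid out in the R matrix
my $n11 = 17;
my $n12 = 12;
my $n21 = 8842559;
my $n22 = 10003821;

# Marginal totals in Text::NSP's terms
my $n1p = $n11 + $n12;                  # row 1 total
my $np1 = $n11 + $n21;                  # column 1 total
my $npp = $n11 + $n12 + $n21 + $n22;    # grand total

my $p = calculateStatistic(
    n11 => $n11,
    n1p => $n1p,
    np1 => $np1,
    npp => $npp,
);
print "two-tailed p = $p\n";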

Is there a way to improve this ANTLR 3 Grammar for positive and negative integer and decimal numbers?

Is there a way to express this in a less repetitive fashion, with the optional positive and negative signs?
What I am trying to accomplish is to optionally allow positive + (the default) and negative - signs on number literals that may have exponents and/or decimal parts.
NUMBER : ('+' | '-')? DIGIT+ '.' DIGIT* EXPONENT?
       | ('+' | '-')? '.'? DIGIT+ EXPONENT?
       ;

fragment
EXPONENT : ('e' | 'E') ('+' | '-')? DIGIT+
         ;

fragment
DIGIT : '0'..'9'
      ;
I want to be able to recognize NUMBER patterns. I am not so concerned about arithmetic on those numbers at this point (I will be later), but I am trying to understand how to recognize NUMBER literals that look like:
123
+123
-123
0.123
+.123
-.123
123.456
+123.456
-123.456
123.456e789
+123.456e789
-123.456e789
and any other standard formats that I haven't thought to include here.
To answer your question: no, there is no way to improve this AFAIK. You could place ('+' | '-') inside a fragment rule and use that fragment, just like the exponent-fragment, but I wouldn't call it a real improvement.
Note that unary + and - signs generally are not a part of a number-token. Consider the input source "1-2". You don't want that to be tokenized as 2 numbers, NUMBER[1] and NUMBER[-2], but as NUMBER[1], MINUS[-] and NUMBER[2], so that your grammar contains the following:
parse
    : statement+ EOF
    ;

statement
    : assignment
    ;

assignment
    : IDENTIFIER '=' expression
    ;

expression
    : addition
    ;

addition
    : multiplication (('+' | '-') multiplication)*
    ;

multiplication
    : unary (('*' | '/') unary)*
    ;

unary
    : '-' atom
    | '+' atom
    | atom
    ;

atom
    : NUMBER
    | IDENTIFIER
    | '(' expression ')'
    ;

IDENTIFIER
    : ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_' | DIGIT)*
    ;

NUMBER
    : DIGIT+ '.' DIGIT* EXPONENT?
    | '.'? DIGIT+ EXPONENT?
    ;

fragment
EXPONENT
    : ('e' | 'E') ('+' | '-')? DIGIT+
    ;

fragment
DIGIT
    : '0'..'9'
    ;
and addition will therefore match the input "1-2".
EDIT
An expression like 111.222 + -456 will be parsed as this:
[parse tree: addition of NUMBER 111.222 and a unary - applied to NUMBER 456]
and +123 + -456 as:
[parse tree: addition of a unary + applied to NUMBER 123 and a unary - applied to NUMBER 456]