Replace small numbers in Perl

In a text file, I have many lines looking like (a,b,c) where a, b, and c are double precision real numbers, for instance (8.27605704077856,0.505526531790625,1.15577754382534e-05). Is there a simple way to replace numbers smaller than 10e-4 by 0 in Perl?
Edit: For instance, the text file to be treated looks like:
\plotinstruction[color,style,width]
points{
(8.27,0.5,1.1e-05)
(8.26,1,4.1e-06)
(8.25,1.5,3e-06)
}
and I want to write in a new file:
\plotinstruction[color,style,width]
points{
(8.27,0.5,0)
(8.26,1,0)
(8.25,1.5,0)
}

Perhaps I'm missing something, but perhaps use of map would help?
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
my @values = (8.27605704077856, 0.505526531790625, 1.15577754382534e-05);
my @filtered_values = map(($_ > 1e-4) ? $_ : 0, @values);
print Dumper \@filtered_values;
Results:
$VAR1 = [
'8.27605704077856',
'0.505526531790625',
0
];
To parse input, you could use a regular expression to extract a comma-separated string of numbers, using split on that to get a Perl list to run map upon.
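A minimal sketch of that approach, assuming every data line has the form (a,b,c) and that the threshold is 1e-3 (the question's "10e-4"). The helper name zero_small and the hard-coded sample lines are illustrative only; real code would read from a file handle instead.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: zero out every number in a "(a,b,c)" line whose
# absolute value is below $threshold; any other line passes through unchanged.
sub zero_small {
    my ($line, $threshold) = @_;
    # Only touch lines that are a parenthesised, comma-separated list.
    if ($line =~ /^\(([^()]*)\)\s*$/) {
        my @values = map { abs($_) < $threshold ? 0 : $_ } split /,/, $1;
        return '(' . join(',', @values) . ")\n";
    }
    return $line;
}

# Sample input taken from the question (assumed layout).
my @lines = (
    "\\plotinstruction[color,style,width]\n",
    "points{\n",
    "(8.27,0.5,1.1e-05)\n",
    "(8.26,1,4.1e-06)\n",
    "(8.25,1.5,3e-06)\n",
    "}\n",
);

print zero_small($_, 1e-3) for @lines;
```

Values that are kept are returned as the original strings, so their formatting is not changed by the numeric comparison.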

You can write:
perl -pwe 's/\d[\d.e+-]+/$& < 0.001 && $& > -0.001 ? "0" : $&/ge' < INPUT > OUTPUT
(-p means to read in the input, one line at a time, into $_, run the program, print out $_, and loop again; -w enables warnings; -e means that the program is specified directly as a command-line argument; s/// is a regex-substitution; /g means that it's a "global" substitution; and /e means that the replacement-text should be treated as a full Perl expression, rather than as a string with mere variable interpolation.)

Related

How to grep square brackets in perl

I am trying to grep [0](including square brackets) in a file using perl, I tried following code
my @output = `grep \"\[0\]\" log `;
But instead of returning [0], it is giving output where it matches 0
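Your problem is that you need to escape the [ and ] twice. Backticks interpolate like a double-quoted string, so Perl consumes the first backslash before the command ever reaches grep, and an unescaped [ ... ] has a special meaning in regexes (it defines a character class).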
#!/usr/bin/perl
use strict;
use warnings;
my @output = `grep "\\[0\\]" log `;
print for @output;
But you really don't need to use the external grep command. Perl is great at text processing.
#!/usr/bin/perl
use strict;
use warnings;
while (<>) {
print if /\[0\]/;
}
My solution reads from any file whose name is given as an argument to the program (or from STDIN).
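If the text to search for is held in a variable, the escaping can be left to Perl. The following is a small sketch using \Q...\E inside the regex; the variable names and the sample lines are made up for illustration.

```perl
use strict;
use warnings;

my $needle = '[0]';    # literal text to search for
my @lines  = ("status [0] ok\n", "value 0 only\n", "array[0]\n");

# \Q...\E quotes regex metacharacters, so [ and ] are matched literally
# instead of being treated as a character class.
my @matches = grep { /\Q$needle\E/ } @lines;

print @matches;
```

Only the two lines that contain the literal text [0] are printed; the line with a bare 0 is skipped.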

How to store value from cut command into a Perl array?

my @up = `cat abc.txt|head -2|tail -1|cut -d' ' -f1-3`;
Instead of storing the individual fields in the array, it stores the entire output as a string in the first element.
This is the output I am getting
$up[0] = 'xxx 12 234'
I want this
@up = ('xxx', 12, 234)
It looks like you want the first three space-delimited fields of the second line of the file abc.txt.
The problem is that backticks will return one line of output in each element of the array, and because cut prints all three fields on a single line, they appear as a single array element.
You could split the value again inside Perl, but when you have the whole of the Perl language available, it's wasteful to use the shell for something so simple. You should do everything in Perl.
This program will do as you ask. I've used Data::Dump only so that you can verify that the contents of @up are as you wanted.
use strict;
use warnings 'all';
use Data::Dump;
my @up = do {
open my $fh, '<', 'abc.txt' or die $!;
<$fh>; # Skip one line
(split ' ', <$fh>)[0 .. 2];
};
dd \@up;
output
["xxx", 12, 234]
You can either split the result on whitespace:
my @up = split(/\s+/, `cat abc.txt ...`);
Or you can set the input record separator to a space beforehand. This is less flexible, though: the separator is just a plain string, so two spaces in a row are treated as an empty field in the middle:
local $/ = " ";
my @up = `cat abc.txt ...`;
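The difference between splitting on runs of whitespace and splitting on a single space can be seen in a short sketch. The sample line below is invented and deliberately contains two spaces after the first field.

```perl
use strict;
use warnings;

my $line = "xxx  12 234\n";    # note the two spaces after "xxx"

# split ' ' is a special case: it skips leading whitespace and splits
# on runs of whitespace, so the newline does not end up in a field.
my @by_whitespace = split ' ', $line;

# split / / splits on every single space, so the doubled space
# yields an empty field and the last field keeps its newline.
my @by_space = split / /, $line;

print scalar(@by_whitespace), " fields with split ' '\n";
print scalar(@by_space), " fields with split / /\n";
```

Here split ' ' gives three fields, while split / / gives four, one of them empty.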

How do I extract lines between two strings

I am an absolute beginner in Perl and I am trying to extract lines of text between two strings on different lines, but without success. It looks like I'm missing something in my code. The code should print out the file name and the found strings. Do you have any idea where the problem could be? Many thanks for your help or advice. Here is the example:
*****************
example:
START
new line 1
new line 2
new line 3
END
*****************
and my script:
use strict;
use warnings;
my $command0 = "";
opendir (DIR, "C:/Users/input/") or die "$!";
my @files = readdir DIR;
close DIR;
splice (@files,0,2);
open(MYOUTFILE, ">>output/output.txt");
foreach my $file (@files) {
open (CHECKBOOK, "input/$file")|| die "$!";
while ($record = <CHECKBOOK>) {
if (/\bstart\..\/bend\b/) {
print MYOUTFILE "$file;$_\n";
}
}
close(CHECKBOOK);
$command0 = "";
}
close(MYOUTFILE);
I suppose that you are trying to use a flip-flop here, which might work well for your input, but you've written it wrong:
if (/\bstart\..\/bend\b/) {
A flip-flop (the range operator in scalar context) takes two expressions as operands, joined by either the .. or the ... operator. What you want is two regexes joined with ..:
if (/\bSTART\b/ .. /\bEND\b/)
Of course, you also want to match the case (upper), or use the /i modifier to ignore case. You might even want to use beginning of line anchor ^ to only match at the beginning of a line, e.g.:
if (/^START\b/ .. /^END\b/)
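As a self-contained illustration of the flip-flop, here is a sketch that collects the lines from START through END, both inclusive. The sample lines are hard-coded and the array name @between is arbitrary.

```perl
use strict;
use warnings;

my @lines = (
    "header\n",
    "START\n",
    "new line 1\n",
    "new line 2\n",
    "END\n",
    "footer\n",
);

my @between;
for (@lines) {
    # The expression becomes true on the line matching START and stays
    # true up to and including the line matching END.
    push @between, $_ if /^START\b/ .. /^END\b/;
}

print @between;
```

The header and footer lines are skipped; the START and END lines themselves are included in the result.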
You should also know that your entire program can be replaced with a one-liner, such as
perl -ne 'print if /^START\b/ .. /^END\b/' input/*
Alas, this only works where the shell expands the glob for you, as on Linux. The cmd shell in Windows does not glob (and needs double quotes rather than single quotes), so you must do that manually:
perl -ne "BEGIN { @ARGV = map glob, @ARGV }; print if /^START\b/ .. /^END\b/" input/*
If you are having troubles with the whole file printing no matter what you do, I think the problem lies with your input file. So take a moment to study it and make sure it is what you think it is, for example:
perl -MData::Dumper -ne"$Data::Dumper::Useqq = 1; print Dumper $_;" file.txt
If you're matching a multi-line string, you might need to tell the regexp about it:
if (/\bstart\..\/bend\b/s) {
note the s after the regex.
Perldoc says:
s
Treat string as single line. That is, change "." to match any
character whatsoever, even a newline, which normally it would not
match.

Perl - count number of columns per row in a csv file

I want to count the number of columns in a row for a CSV file.
row 1 10 columns
row 2 11 columns
etc.
I can print out the value of the last column, but I really just want a count per row.
perl -F, -lane "{print @keys[$_].$F[$_] foreach(-1)}" < testing.csv
I am on a windows machine
Thanks.
If you have a proper csv file, it can contain embedded delimiters (e.g. 1,"foo,bar",2), in which case a simple split will not be enough. You can use the Text::CSV module fairly easily with a one-liner like this:
Copy/paste version:
perl -MText::CSV -lwe"my $c=Text::CSV->new({sep_char=>','}); while($r=$c->getline(*STDIN)) { print scalar @$r }" < sorted.csv
Readable version:
perl -MText::CSV # use Text::CSV module
-lwe # add newline to print, use warnings
"my $c = Text::CSV->new(); # set up csv object
while( $r = $c->getline(*STDIN) ) { # get lines from stdin
print scalar @$r # print row size
}" < sorted.csv # input file to stdin
If your input can be erratic, Text::CSV->getline might choke on corrupted lines (the while loop is ended), in which case it may be safer to use plain parsing:
perl -MText::CSV -nlwe"
BEGIN { $r = Text::CSV->new() };
$r->parse($_);
print scalar $r->fields
" comma.csv
Note that in this case we use a different input method. This is because while getline() requires a file handle, parse() does not. Since the diamond operator uses either ARGV or STDIN depending on your argument, I find it is better to be explicit.
If you don't have commas as part of the fields, you can split the line and count the number of fields:
#! /usr/bin/perl
use strict;
use warnings;
my @cols = split(',', $_);
my $n = @cols;
print "row $. $n columns\n";
You can call it like this:
perl -n script.pl testing.csv
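One detail worth knowing with the plain split approach is that split drops trailing empty fields by default. The sketch below, using invented sample rows, passes a limit of -1 so that a row ending in empty columns is still counted in full. It assumes no quoted fields with embedded commas.

```perl
use strict;
use warnings;

my @rows = ("a,b,c\n", "1,2,3,4\n", "x,,\n");

my @counts;
for my $row (@rows) {
    chomp(my $line = $row);
    # A limit of -1 keeps trailing empty fields, so "x,," counts as 3 columns.
    my @cols = split /,/, $line, -1;
    push @counts, scalar @cols;
}

printf "row %d %d columns\n", $_ + 1, $counts[$_] for 0 .. $#counts;
```

The three sample rows are reported as having 3, 4, and 3 columns.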

Is 999...9 a real number in Perl?

sub is_integer {
defined $_[0] && $_[0] =~ /^[+-]?\d+$/;
}
sub is_float {
defined $_[0] && $_[0] =~ /^[+-]?\d+(\.\d+)?$/;
}
With the code above, if we give 999999999999999999999999999999999999999999 as input, it reports that this is not a real number.
Why does it behave like that?
I forgot to mention one more thing:
If I am using this code for $x as the above value:
if($x > 0 || $x <= 0 ) {
print "Real";
}
Output is real.
How is this possible?
$ perl -e 'print 999999999999999999999999999999999999999999'
1e+42
i.e. Perl uses scientific representation for this number and that is why your regexp doesn't match.
Use the looks_like_number function from Scalar::Util (which is a core module).
use feature 'say';
use Scalar::Util qw( looks_like_number );
say "Number" if looks_like_number 999999999999999999999999999999999999999999;
# above prints "Number"
Just to add one more thing. As others have explained, the number you are working with is out of range for a Perl integer (unless you are on a 140 bit machine). Therefore, the variable will be stored as a floating point number. Regular expressions operate on strings. Therefore, the number is converted to its string representation before the regular expression operates on it.
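The conversion described above can be observed directly. This is a small sketch, with variable names chosen for illustration, that compares the same digits given as a numeric literal and as a quoted string. The exact exponent format of the stringified number may vary by platform.

```perl
use strict;
use warnings;

# A numeric literal this large is stored as a floating-point value.
my $num     = 999999999999999999999999999999999999999999;
my $num_str = "$num";    # the string the regex actually sees

# The same digits kept as a string are never converted.
my $digits = '999999999999999999999999999999999999999999';

my $integer_re = qr/^[+-]?\d+$/;

print "number: $num_str ", ($num_str =~ $integer_re ? "matches" : "does not match"), "\n";
print "string: $digits ", ($digits =~ $integer_re ? "matches" : "does not match"), "\n";
```

So if the value arrives as text (for example, read from a file or STDIN) and is never used in numeric context, the original regex still matches it.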
Others have explained what is going on: out of the box, Perl can't handle numbers that large without using scientific notation.
If you need to work with large numbers, take a look at bignum or its components, such as Math::BigInt. For example:
use strict;
use warnings;
use Math::BigInt;
my $big_str = '900000000000000000000000000000000000000';
my $big_num = Math::BigInt->new($big_str);
$big_num ++;
print "Is integer: $big_num\n" if is_integer($big_num);
sub is_integer {
defined $_[0] && $_[0] =~ /^[+-]?\d+$/;
}
Also, you may want to take a look at bignum in the Perl documentation.