reset state variables on next file in `perl -n` one-liner - perl

I'm processing multiple files with find ... | xargs perl -ne, and when I move on to the next file I need to reset some variables, the way gawk 'BEGINFILE {}' does.
As a workaround, I check whether the current filename has changed. Is there a cleaner way?
if ($oldARGV ne $ARGV) { $oldARGV = $ARGV; $var1=""; ... } ...

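Use eof without parentheses (or eof ARGV), which is true at the end of each input file, and reset your variables there. Note that eof() with empty parentheses is different: it is true only at the end of the last file.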
$ perl -nE 'say "Done with file $ARGV" if eof' *.txt
Done with file a.txt
Done with file b.txt

Related

How to rename the word from anywhere under my .html files without using File::Find in perl?

Here is my query the following is my directory structure .
`-- ka
    |-- ka.html
    `-- ka_tal
        |-- tal_cation
        |   |-- tal_cation.html
        |   `-- ev1
        |       `-- ka_ka_tal_tal_cation_v1.html
        |
        `-- ka_tal.html
Every .html file here contains the word tevision, and I want to replace every occurrence of tevision with ev. I tried the following code, but it does not work for me.
finddepth(sub {
    return unless -d;
    (my $new = $_) =~ s/tevision/ev/ or return;
    rename $_, $new or warn "Err renaming $_ to $new in $File::Find::dir: $!";
}, ".");
How can I replace all of these words inside the .html files using Perl?
Inside the subroutine that is called by File::Find, the $_ variable will contain the name of the file that has been found. Your code seems to think it will contain the contents of the file. To get the contents of the file, you will need to open the file and read the contents. Then you will need to make your conversions and write the altered contents back to the original file.
This is all far easier using a combination of the Unix find command and Perl's command-line options. It probably helps to work out the solution for a single file and then use find to execute that command on all of the required files.
$ perl -i -pe 's/tevision/ev/' some_file.html
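This opens some_file.html and processes it line by line. Without the /g modifier, s/// replaces only the first tevision on each line; use s/tevision/ev/g to replace every occurrence.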
-i : this writes the converted input back to the original file
-p : this loops round all the lines in the file. Each one, in turn, is stored in $_ and the contents of $_ are printed after each iteration of the loop.
-e : this is the code you want to run for each line of the file.
We can then put that into a find command to get the full behaviour that you want.
$ find . -name "*.html" -exec perl -i -pe 's/tevision/ev/' {} \;
The following are more efficient because they don't launch as many instances of perl:
$ find . -name "*.html" -exec perl -i -pe 's/tevision/ev/' {} + # -exec ... {} + is specified by POSIX, not only GNU find
$ find . -name "*.html" -print0 | xargs -0 perl -i -pe 's/tevision/ev/'
Try this
my $path = "ka/";
find($path);

# Recursively walk $s and edit every .html file in place.
sub find {
    my ($s) = @_;
    foreach my $ma (glob "\Q$s\E/*")
    {
        if (-f $ma && $ma =~ m/\.html\z/)
        {
            system('perl', '-i', '-pe', 's/tevision/ev/', '--', $ma);
        }
        elsif (-d $ma)
        {
            find($ma);
        }
    }
}

Perl deleting "blank" lines from a csv file

I'm looking to delete blank lines in a CSV file, using Perl.
I'm not too sure how to do this, as these lines aren't exactly "blank" (they're just a bunch of commas).
I'd also like to save the output as a file of the same name, overwriting the original.
How could I go about doing this?
edit: I can't use modules or any source code due to network restrictions...
You can do this using a simple Perl one-liner:
perl -i -ne 'print unless /^[,\s]*$/' <filename>
The -n flag assumes this loop around your program:
while(<>) {
print unless /^[,\s]*$/;
}
The -i flag edits the file in place, so the output overwrites your input file.
Note: If you are worried about losing your data with -i, you can specify -i.bak and perl will automatically save the original file as <filename>.bak
More of a command line hack,
perl -i -ne 'print if /[^,\r\n]/' file.csv
If you want to put it inside a shell script, you can do this:
#!/bin/sh
perl -i -n -e 'print $_ unless ($_ =~ /^\,+$/);' "$@"

How to add a line in every file in a directory with perl commandline

I want to add a line at the beginning of every file in a directory.
perl -i.bkp -p -e 'print "#include top_level.reset\n" if $. == 1' *.reset
But this command is updating only the first file in the directory. I think this is because $. is not resetting for next file.
How to modify all the files.
You are correct, $. is not reset between files when processing @ARGV. See perlvar. You can work around it by explicitly closing ARGV on EOF - see eof. But I would not bother; instead I would use the shell to iterate over the files:
for f in *.reset; do perl -i.bkp -p -e 'print "#include top_level.reset\n" if $. == 1' "$f"; done
ls -1 *.reset | xargs -n 1 perl -i.bkp -p -e 'print "#include top_level.reset\n" if $. == 1'

Using command line to remove lines from text file

I have a text file and need to remove all lines that DO NOT contain http. Alternatively, it could just output all the lines that DO contain http to a new file.
The name of my original file is list.txt and I need to generate a new file with a name like new.txt
I know that there are several ways to do this via the command line, but what I'm really looking for is the quickest way, since I need to do this with several files and each of them is a few gigs in size...
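The quickest, shortest solution keeps only the lines that contain http:
grep -F "http" list.txt > new.txt
Adding -v inverts the match and keeps the lines without http instead, which is what the scripts below do (fgrep is the older spelling of grep -F).
Of course, grep, egrep, awk, perl, etc. make this more flexible.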
Here is a short shell script. Create "delhttp.sh" containing:
#!/bin/bash
# Prints the lines that do NOT contain "http"; drop -v to keep the matching lines instead.
if [ $# -eq 0 ]; then
    grep -F -v "http"                # no arguments: filter stdin to stdout
elif [ $# -eq 1 ]; then
    f1=$1
    if [ ! -f "$f1" ]; then echo "file $f1 dne"; exit 1; fi
    grep -F -v "http" "$f1"
elif [ $# -eq 2 ]; then
    f1=$1
    if [ ! -f "$f1" ]; then echo "file $f1 dne"; exit 1; fi
    f2=$2
    grep -F -v "http" "$f1" > "$f2"
fi
Then make this file executable using,
chmod +x delhttp.sh
Here is a Perl script (if you prefer). Create "delhttp.pl" containing:
#!/usr/bin/env perl
use strict;
use warnings;
# "-" means stdin/stdout; the two-argument open below relies on that.
my $f1=$ARGV[0]||"-";
my $f2=$ARGV[1]||"-";
my ($fh, $ofh);
open($fh,"<$f1") or die "file $f1 failed: $!";
open($ofh,">$f2") or die "file $f2 failed: $!";
# Print only the lines that do not contain "http".
while(<$fh>) { if( !($_ =~ /http/) ) { print $ofh "$_"; } }
Again, make this file executable using,
chmod +x delhttp.pl
perl -i -lne 'print if(/http/)' your_file
The above command deletes every line that does not contain http from the file.
If you want to keep a backup of the original file, you can pass a suffix such as ".bak" to -i:
perl -i.bak -lne 'print if(/http/)' your_file
This creates your_file.bak, a copy of the original file, and modifies your_file itself as required.
Also you can use awk:
awk '/http/' your_file
This writes to the console. You can use '>' to redirect the output to a new file.
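You could use grep. By default it prints the lines that match, which is what you want here:
grep 'http' list.txt > new.txt
Using -v inverts the sense of matching, to select non-matching lines (the ones without http):
grep -v 'http' list.txt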
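Using a Perl one-liner. The negative lookahead makes this print the lines that do not contain http; for the lines that do, perl -ne 'print if /http/' is enough:
perl -ne '/^(?:(?!http).)*$/ and print' list.txt > new.txt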

Only print matching lines in perl from the command line

I'm trying to extract all ip addresses from a file. So far, I'm just using
cat foo.txt | perl -pe 's/.*?((\d{1,3}\.){3}\d{1,3}).*/\1/'
but this also prints lines that don't contain a match. I can fix this by piping through grep, but this seems like it ought to be unnecessary, and could lead to errors if the regexes don't match up perfectly.
Is there a simpler way to accomplish this?
Try this (with $1 rather than \1 in the replacement, which is the preferred Perl spelling):
cat foo.txt | perl -ne 'print if s/.*?((\d{1,3}\.){3}\d{1,3}).*/$1/'
or:
<foo.txt perl -ne 'print if s/.*?((\d{1,3}\.){3}\d{1,3}).*/$1/'
It's the shortest alternative I can think of while still using Perl.
However this way might be more correct:
<foo.txt perl -ne 'if (/((\d{1,3}\.){3}\d{1,3})/) { print $1 . "\n" }'
If you've got grep, then just call grep directly:
grep -Po "(\d{1,3}\.){3}\d{1,3}" foo.txt
You've already got a suitable answer of using grep to extract the IP addresses, but just to explain why you were seeing non-matches being printed:
perldoc perlrun will tell you about all the options you can pass Perl on the command line.
Quoting from it:
-p causes Perl to assume the following loop around your program, which makes it iterate over filename arguments somewhat like sed:
  LINE:
    while (<>) {
        ... # your program goes here
    } continue {
        print or die "-p destination: $!\n";
    }
You could have used the -n switch instead, which does the same but does not print automatically. For example (note the explicit newline, otherwise the addresses run together on one line):
cat foo.txt | perl -ne '/((?:\d{1,3}\.){3}\d{1,3})/ and print "$1\n"'
Also, there's no need to use cat; Perl will open and read the filenames you give it, so you could say e.g.:
perl -ne '/((?:\d{1,3}\.){3}\d{1,3})/ and print "$1\n"' foo.txt
ruby -0777 -ne 'puts $_.scan(/((?:\d{1,3}\.){3}\d{1,3})/)' file