How to remove all lines except those ending in .c using Perl

I have a string $string that holds a list of lines, some ending with *.c, *.pdf, etc., and a few without any extension (these are directories). I need to remove all lines except the *.c lines. How can I do that using a regular expression? I've written the line below, which removes the *.c files, but how do I negate it?
next if $line =~ /(\.c)/i;
Any ideas.
thanks,
Sharath

Use unless instead of if to reverse the sense of the condition.
next unless $line =~ /\.c$/i;
or simply invert the test:
next if $line !~ /\.c$/i;
Also, you don't need parentheses around the regexp, and you need $ to anchor it to the end of the line.
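As a minimal sketch of the same idea applied to the whole string at once, the lines can be split and filtered with grep. The sample contents of $string are invented for illustration; only the regex comes from the answer above.

```perl
use strict;
use warnings;

# Hypothetical sample data: a multi-line string like the one in the question.
my $string = "src/main.c\ndocs/manual.pdf\nsrc\nlib/util.C\nnotes.cpp\n";

# Keep only lines ending in .c (case-insensitive); $ anchors the match
# to the end of each line, so notes.cpp is rejected.
my @c_files = grep { /\.c$/i } split /\n/, $string;

print "$_\n" for @c_files;
```

With the sample data this keeps src/main.c and lib/util.C and drops everything else.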

Related

How to remove newline from the end of a file using Perl

I have a file that reads like this:
dog cat mouse
apple orange pear
red yellow green
There is a tab \t separating the words on each row, and a newline \n separating each of the rows. Below the last line, red yellow green, there is a blank line due to a newline \n after green.
I would like to use Perl to remove the newline.
I have seen a few articles, such as "How can I delete a newline if it is the last character in a file?", that give solutions for Perl, but I would like to do this in code so that I can incorporate it into my Perl script.
I don't know if this might be possible using chomp, or if chomp works on each line separately (I would like to keep the newline between lines).
Also I have seen previously comments that suggest maintaining a newline at the end of a file because Unix commands work better when a file ends with a newline. However, I have created a script which relies on input files not ending with a newline, therefore I really feel removing the newlines is necessary for my work.
You can try this:
perl -pe 'chomp if eof' file.txt
Here is another simple way, if you need it in a script:
open my $fh, '<', 'file.txt' or die "Can't open file.txt: $!";
my @lines = <$fh>; # read all lines and store in array
close $fh;
chomp $lines[-1] if @lines; # remove newline from last line
print @lines;
Or something like this (in script), as suggested by jnhc for the command line:
open my $fh, '<', 'file.txt' or die "Can't open file.txt: $!";
while (<$fh>) {
    chomp if eof $fh; # only the last line loses its newline
    print;
}
close $fh;

Need further understanding in the next unless code I'm reading

I need help with two things in this code that I'm reading. First, I keep seeing this inside a while loop that reads a file:
while (<filename>) {
    next unless (/\w/);
    chomp;
    s/^\s*//;
    s/^\s*$//;
    my($name, $datatype, $io, $dummy) = split /\s*,\s*/, $_, 4;
}
So, I'm wondering what that is doing. There are commas in the line being read, so wouldn't the commas make it go to the next iteration? So how would it split the lines if it moves on to another iteration when the commas are read?
Another one I'm stumped by is:
while (<AP>) {
chomp;
s/
//g;
}
I have no idea what that code is actually substituting...
Thanks!
The first snippet:
Reads a line from a filehandle called filename. This is a really bad name for a filehandle
It skips the processing if there is not even a single \w (word character) on the line.
The next unless (/\w/); is the same as next if not (/\w/). Note that there is no need for parentheses -- next unless /\w/; is fine.
A word character is, from perlretut:
\w matches a word character (alphanumeric or _), not just [0-9a-zA-Z_] but also digits and characters from non-roman scripts
It removes (only) the newline with chomp. Then it removes leading spaces, if any
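It empties lines that contain only whitespace. (Such lines have no \w and were already skipped by next unless /\w/, so this substitution has no effect here.)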
It splits the line by commas, allowing that they have spaces before and/or after. It also limits the number of terms returned, to 4. This means that it returns the first three comma-separated fields, and then all the rest as one string in the last element of the list
The second snippet is really bad, whatever it is meant to do. (Remove spaces on the line?)
Comments
It is far better to use lexical filehandles, rather than barewords. So you'd open a file as
open my $fh, '<', $filename or die "Can't open $filename: $!";
and read it by while (my $line = <$fh>) or by while (<$fh>).
Normally you'll see lines skipped if they have nothing other than spaces
next unless /\S/; # or
next if /^\s*$/;
Using \w also skips lines that contain only non-word characters (punctuation, for example), which means that one had better be very sure that those are fine to skip.
Here it may be meant to skip a line with commas but no \w (comma is not matched by \w), for which split would return spaces (or empty strings) in a list. I find this a bit hidden and fragile. I'd drop lines with spaces only, and handle possible loose commas in processing. As it stands it doesn't help with ,,a, anyway, which yields ('', '', 'a'). So checking is probably needed in any case.
Note that altogether this code leaves trailing spaces on the last field. The split pattern only consumes whitespace around the commas, and nothing else in the loop strips the end of the line. (With the optional third argument, the limit, split also keeps trailing empty fields.)
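The behaviour of the limit and of the trailing spaces can be checked with a short sketch. The input line is made up for illustration; the variable names follow the snippet from the question.

```perl
use strict;
use warnings;

# Hypothetical input line with extra commas and trailing spaces.
my $line = "  clk , logic , input , a comment, with a comma  \n";

chomp $line;
$line =~ s/^\s*//;    # strips leading whitespace only

# Limit of 4: the fourth element receives everything after the third comma.
my ($name, $datatype, $io, $dummy) = split /\s*,\s*/, $line, 4;

# The last field keeps its internal comma and its trailing spaces.
print "[$name] [$datatype] [$io] [$dummy]\n";

# Without a limit, trailing empty fields are dropped but leading ones are kept.
my @fields = split /\s*,\s*/, ',,a,';
print scalar(@fields), " fields\n";
```

If trailing whitespace matters, adding s/\s+$//; after the chomp is one way to remove it before splitting.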

Print Line By Line

I've been trying to work on a lyrical bot for my server, but before I started to work on it, I wanted to give it a test so I came up with this script using the Lyrics::Fetcher module.
use strict;
use warnings;
use Lyrics::Fetcher;
my ($artist, $song) = ('Coldplay', 'Adventures Of A Lifetime');
my $lyrics = Lyrics::Fetcher->fetch($artist, $song, [qw(LyricWiki AstraWeb)]);
my @lines = split("\n\r", $lyrics);
foreach my $line (@lines) {
    sleep(10);
    print $line;
}
This script works fine: it grabs the lyrics and prints them out as a whole (which is not what I'm looking for).
I was hoping to achieve a line by line print of the lyrics every 10 seconds. Help please?
Your call to split looks suspicious. In particular the regex "\n\r". Note, the first argument to split is always interpreted as a regex regardless of whether you supply a quoted string.
On Unix systems the line ending is typically "\n". On DOS/Windows it's "\r\n" (the reverse of what you have). On ancient Macs it was "\r". To match all three you could do:
my @lines = split(/\r\n|\n|\r/, $lyrics);
You will need to enable autoflush, otherwise the lines will just be buffered and printed when the buffer is full or when the program terminates
STDOUT->autoflush;
You can use the regex generic newline pattern \R to split on any line ending, whether your data contains CR, LF, or CR LF. This feature is available only in Perl v5.10 or later
my @lines = split /\R/, $lyrics;
And you will need to print a newline after each line of lyrics, because the split will have removed them
print $line, "\n";
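Putting the three suggestions together gives something like the sketch below. To keep it runnable without network access, a placeholder string stands in for the fetched lyrics, and the $delay variable is an assumption added so the pause is easy to change.

```perl
use strict;
use warnings;
use IO::Handle;    # makes STDOUT->autoflush available on Perls older than 5.14

# Placeholder text standing in for the fetched lyrics (mixed line endings on purpose).
my $lyrics = "first line\r\nsecond line\nthird line\rfourth line";
my $delay  = 1;    # seconds between lines; the question uses 10

STDOUT->autoflush(1);    # send each line to the terminal as soon as it is printed

my @lines = split /\R/, $lyrics;    # \R requires Perl 5.10 or later
for my $line (@lines) {
    print $line, "\n";    # split removed the line endings, so add one back
    sleep $delay;
}
```

Printing before sleeping means the first line appears immediately; swap the two statements if the delay should come first, as in the original script.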

Perl: How to remove spaces and blank lines in one pass

I have two Perl scripts: the first removes blank lines from a file and the second removes all spaces inside a file. I wonder if it's possible to combine both of these operations in one script.
For spaces, I have used this transliteration: $str =~ tr/ //d;
and for Blank lines, I have used this regexp
while (<$file>) {
    if (/\S/) {
        print $new_file $_;
    }
}
It should be really easy: just add tr/ //d before the if line.
Note: It will remove lines containing spaces only, too. If you want to keep them (but transliterated to empty lines), insert the transliteration before the print line.
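A minimal sketch of the combined loop follows. It uses in-memory filehandles with invented sample text so that it runs without real files; with actual files, $file and $new_file would be opened on file names instead.

```perl
use strict;
use warnings;

# Hypothetical in-memory input and output, so the sketch runs without real files.
my $input = "a b c\n\n   \nd  e\n";
open my $file, '<', \$input or die "Can't open input: $!";
my $output = '';
open my $new_file, '>', \$output or die "Can't open output: $!";

while (<$file>) {
    tr/ //d;                         # delete every space character
    print {$new_file} $_ if /\S/;    # skip lines that are now empty
}
close $new_file;
close $file;

print $output;
```

Because the transliteration runs before the test, a line that held only spaces is dropped along with the truly empty ones.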
If you wish to trim trailing whitespace from the end of each line, you might want it to work like this:
perl -pi -e 's/\s*$/\n/' f1 f2 f3 #UNIX file format
perl -pi -e 's/\s*$/\r\n/' f1 f2 f3 #DOS file format

In perl pattern matching..how to exclude the \n character from pattern

I am new to perl and writing my first few programs and using its pattern matching abilities. I am reading a file into array like this:
@list=<file>
Then indexing each line of array by $list[0..9] etc, and when I match it against a pattern, the $list[0] includes \n character, hence the match fails. So if ($string =~ $list[0]) fails though without \n character in pattern it would match.
How do I tell pattern matcher to not consider the \n character from pattern?
Thanks
You can shave the line ends from the array after reading:
@lines = …;
chomp @lines;
Now @lines contains the lines without line ends. See perldoc -f chomp for details.
If you want to remove the \n from your lines you can:
chomp $list[0]
see perldoc -f chomp for the details.
This is a good opportunity to get to know how Perl modules work.
You can for example use Perl6::Slurp, which will a) read the file, b) put the contents in an array, and c) remove the newline characters for you.
For example:
use Perl6::Slurp;
my @lines = slurp '<:utf8', 'filename', {chomp=>1};
This will match with the \n:
if ( $list[0] =~ "$string\n")
Or if you want the \n to be optional:
if ( $list[0] =~ /$string\n?/ )
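For the direction used in the question, where the array element is the pattern, a sketch might look like the following. The data is invented, and the use of \Q...\E assumes the lines are meant to match literally rather than as regular expressions.

```perl
use strict;
use warnings;

# Hypothetical data standing in for lines read from a file.
my @list   = ("foo.bar\n", "baz\n");
my $string = "prefix foo.bar suffix";

chomp @list;    # strip the trailing newline from every element

# \Q...\E quotes regex metacharacters, so the "." in the line matches only a dot.
if ($string =~ /\Q$list[0]\E/) {
    print "matched '$list[0]'\n";
}
```

If the lines in the file are intended to be regular expressions, drop the \Q and \E and keep only the chomp.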