How can I remove the timestamp from a filename in Perl? - perl

I have a file which has a line in it as:
/hosting/logs/U01-ecom-SIT01/CU01-DC05-IFIO_SIT01_NU01-nc3sz1ecmas11/waslogs/SystemOut_10.01.21_16.54.18.log
I need a script which would read this line and remove the time stamp from it, that is:
10.01.21_16.54.18
The script should print the filename without the timestamp and holding the full path, that is:
/hosting/logs/U01-ecom-SIT01/CU01-DC05-IFIO_SIT01_NU01-nc3sz1ecmas11/waslogs/SystemOut.log
Please help as I'm unable to pattern match and output the file path without the timestamp.

echo "/hosting/logs/U01-ecom-SIT01/CU01-DC05-IFIO_SIT01_NU01-nc3sz1ecmas11/waslogs/SystemOut_10.01.21_16.54.18.log" |
perl -pe "s/_\d\d\.\d\d\.\d\d_\d\d\.\d\d\.\d\d//;"

$ perl -e 's{_\d{2}\.\d{2}\.\d{2}_\d{2}\.\d{2}\.\d{2}}{} and print for @ARGV' /hosting/logs/U01-ecom-SIT01/CU01-DC05-IFIO_SIT01_NU01-nc3sz1ecmas11/waslogs/SystemOut_10.01.21_16.54.18.log

Path shortened to prevent scrolling:
$ cat paths
CU01-DC05-IFIO_SIT01_NU01-nc3sz1ecmas11/waslogs/SystemOut_10.01.21_16.54.18.log
$ perl -pe 's/(_(\d\d(\.\d\d){2})){2}\.log$/.log/' paths
CU01-DC05-IFIO_SIT01_NU01-nc3sz1ecmas11/waslogs/SystemOut.log
The timestamp is made up of 2 sequences that look like _##.##.##. The subsequences end with 2 sequences of .##. These are the roles of the {2} quantifiers.
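The same idea can be wrapped in a small sub, as a minimal sketch. The name strip_timestamp is hypothetical, and the pattern assumes the timestamp always sits directly before the file extension:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the answers above): remove a trailing
# _##.##.##_##.##.## timestamp that sits right before the extension.
sub strip_timestamp {
    my ($path) = @_;
    # (?:_\d\d(?:\.\d\d){2}){2} matches two "_##.##.##" groups;
    # (\.\w+)$ captures the extension so it can be put back.
    $path =~ s/(?:_\d\d(?:\.\d\d){2}){2}(\.\w+)$/$1/;
    return $path;
}

my $in = '/hosting/logs/U01-ecom-SIT01/CU01-DC05-IFIO_SIT01_NU01-nc3sz1ecmas11/waslogs/SystemOut_10.01.21_16.54.18.log';
print strip_timestamp($in), "\n";
```

Anchoring on the extension keeps the substitution from touching underscores earlier in the path, such as the ones in IFIO_SIT01_NU01.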

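while (<>) {
    chomp;                                     # drop the newline so it does not end up in the extension
    @s = split /\//;
    $fullpath = join("/", splice @s, 0, $#s);  # everything except the last component
    @a = split /[_.]/, $s[-1];                 # split the file name on _ and .
    $newfile = "$fullpath/$a[0].$a[-1]";       # base name plus extension
    print $newfile."\n";
}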

You can use the following code:
use strict;
use warnings;
my $var = "/hosting/logs/U01-ecom-SIT01/CU01-DC05-IFIO_SIT01_NU01-nc3sz1ecmas11/waslogs/SystemOut_10.01.21_16.54.18.log";
$var=~s/_\d\d\.\d\d\.\d\d//g;
# $var=~s/_10\.01\.21_16\.54\.18//g; # You can use this way also
print "$var\n";

Related

How to read the textfile by command line arguments and print the column by using perl?

How to read the text file using perl command line arguments and print the third column using perl?
I'm stuck on taking input from the command line and printing the required column. Please help me find the right way to get the expected output.
Code which I wrote to take command line input:(map.pl)
use strict;
use warnings 'all';
use Getopt::Long 'GetOptions';
my @files=GetOptions(
'p_file=s' => \my $p_file,
);
print $p_file ? "p_file = $p_file\n" : "p_file\n";
Output I got for above code:
perl map.pl -p_file cat.txt
p_file = cat.txt
cat.txt:(Input file)
ADG:YUF:TGH
UIY:POG:YTH
GHJUR:"HJKL:GHKIO
Expected output:
TGH
YTH
GHKIO
Perl can automatically read files whose names are provided as command line arguments. The command below should produce your expected output
perl -F: -le 'print $F[2]' cat.txt
-F: turns on autosplit mode, sets the field separator to :, and loops over the lines of the input files (on Perl older than 5.20, add -a and -n explicitly). -l handles line endings during input and output. The code after the -e flag ('print $F[2]', which prints the 3rd field) is executed for each line of the file. Find out more by reading perldoc perlrun.
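Spelled out as ordinary code, the one-liner does roughly the following. This is only a sketch: field_at is a hypothetical name, and the sample lines are copied from cat.txt in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: return the field at 0-based index $idx of a
# colon-separated line, i.e. what $F[$idx] holds in the one-liner.
sub field_at {
    my ($line, $idx) = @_;
    chomp $line;                 # -l strips the line ending on input
    my @F = split /:/, $line;    # -F: with autosplit fills @F like this
    return $F[$idx];
}

# Sample lines from cat.txt in the question.
for my $line ("ADG:YUF:TGH\n", "UIY:POG:YTH\n", "GHJUR:\"HJKL:GHKIO\n") {
    print field_at($line, 2), "\n";   # -l also appends the newline on output
}
```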
You'd need to read the file and split the lines to get the columns, and print the required column. Here's a demo code snippet, using the perl -s switch to parse command line arguments. Run like this ./map.pl -p_file=cat.txt
#!/usr/bin/perl -s
use strict;
use warnings;
use vars qw[$p_file];
die("You need to pass a filename as argument") unless(defined($p_file));
die("Filename ($p_file) does not exist") unless(-f $p_file);
print "Proceeding to read file : $p_file\n\n";
open(my $fh,'<',$p_file) or die($!);
while (my $line = <$fh>) {
    chomp($line);               # chomp returns a count, so it must not be the loop condition
    next unless length $line;   # skip empty lines (but keep a line that is just "0")
    my @cols = split(/:/,$line);
    print $cols[-1],"\n";
}
close($fh);

Move last character of line to specific column -- sed? awk?

I need to replace all lines ending with specific character (say, &) such that this character should be in certain column (say, 80).
Which tool is best?
I have started thinking about sed:
sed 's/\(.*\)&/\1 <what should be here??> &/'
but cannot understand how to replace with variable number of spaces such that & goes to column 80.
Thanks!
Use the /e switch to s/// that tells Perl to evaluate the replacement portion to compute the result.
#! /usr/bin/env perl
use strict;
use warnings;
while (<>) {
s/^(.*)(&)$/$1 . " " x (79 - length $1) . $2/e;
print;
}
Sample run:
$ echo -e 'foo&\n&\nbar &\nbaz' | ./align-ampersands
foo                                                                            &
                                                                               &
bar                                                                            &
baz
If your input contains TAB characters, you will need to use more sophisticated processing.
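One possible way to cope with TABs, as a sketch rather than a definitive fix, is to expand them to spaces first with the core Text::Tabs module, so that length counts display columns. The name align_ampersand is hypothetical, and the code assumes the line has already been chomped and that tab stops are every 8 columns (the Text::Tabs default):

```perl
use strict;
use warnings;
use Text::Tabs;   # core module; exports expand()

# Hypothetical helper: move a trailing '&' to 1-based column $col.
# TABs are expanded first so length() reflects what is displayed.
sub align_ampersand {
    my ($line, $col) = @_;
    $line = expand($line);
    return $line unless $line =~ /^(.*)&$/;   # leave other lines alone
    my $head = $1;
    my $pad  = $col - 1 - length $head;
    $pad = 0 if $pad < 0;                     # line is already too long
    return $head . (" " x $pad) . "&";
}

print align_ampersand("foo\tbar&", 80), "\n";
```

Lines that are already longer than the target column are left with the & directly after the text, which matches the behaviour of the /e version above.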
Not sure if I understand your question correctly but you can try something like (assuming your file is space delimited):
awk '/&$/ {for(i=1;i<=NF;i++) $i=(i==80)?"& "$i:$i}1' yourFile
Awk and Perl will both work. Both have printf and substr:
#! /usr/bin/env perl
use warnings;
use strict;
my $string = "this is some text &";
my $last_char = substr($string, -1, 1);
$string = substr ($string, 0, length ($string ) - 1);
printf qq(%-79.79s%s\n), $string, $last_char;
The substr command is available in both Awk and Perl.
The whole command could be made into a one liner:
printf qq(%-79.79s%s\n), substr ($string, 0, length ($string ) - 1), substr($string, -1, 1);
awk '/&$/{$80="&"}1' file

sed, replace globally a delimiter with the first part of the line

Let's say I have the following lines:
1:a:b:c
2:d:e:f
3:a:b
4:a:b:c:d:e:f
How can I edit this with sed (or perl) so that it reads:
1a1b1c
2d2e2f
3a3b
4a4b4c4d4e4f
I have done with awk like this:
awk -F':' '{gsub(/:/, $1, $0); print $0}'
but it takes ages to complete, so I'm looking for something faster.
'Tis a tad tricky, but it can be done with sed (assuming the file data contains the sample input):
$ sed '/^\(.\):/{
s//\1/
: retry
s/^\(.\)\([^:]*\):/\1\2\1/
t retry
}' data
1a1b1c
2d2e2f
3a3b
4a4b4c4d4e4f
$
You may be able to flatten the script to one line with semi-colons; sed on MacOS X is a bit cranky at times and objected to some parts, so it is split out into 6 lines. The first line matches lines starting with a single character and a colon and starts a sequence of operations for when that is recognized. The first substitute replaces, for example, '1:' by just '1'. The : retry is a label for branching to, a key part of this. The next substitution copies the first character on the line over the first remaining colon. The t retry goes back to the label if the substitute changed anything. The last line closes the entire sequence of operations for the initially matched line.
#!/usr/bin/perl
use warnings;
use strict;
while (<DATA>) {
if ( s/^([^:]+)// ) {
my $delim = $1;
s/:/$delim/g;
}
print;
}
__DATA__
1:a:b:c
2:d:e:f
3:a:b
4:a:b:c:d:e:f
use feature qw/ say /;
use strict;
use warnings;
while( <DATA> ) {
chomp;
my @elements = split /:/;
my $interject = shift @elements;
local $" = $interject;
say $interject, "@elements";
}
__DATA__
1:a:b:c
2:d:e:f
3:a:b
4:a:b:c:d:e:f
Or on the linux shell command line:
perl -aF/:/ -pe '$i=shift @F;$_=$i.join $i,@F;' infile.txt

awk to perl conversion

I have a directory full of files containing records like:
FAKE ORGANIZATION
799 S FAKE AVE
Northern Blempglorff, RI 99xxx
01/26/2011
These items are being held for you at the location shown below each one.
IF YOU ASKED THAT MATERIAL BE MAILED TO YOU, PLEASE DISREGARD THIS NOTICE.
The Waltons. The complete DAXXXX12118198
Pickup at:CHUPACABRA LOCATION 02/02/2011
GRIMLY, WILFORD
29 FAKE LANE
S. BLEMPGLORFF RI 99XXX
I need to remove all entries with the expression Pickup at:CHUPACABRA LOCATION.
The "record separator" issue:
I can't touch the input file's formatting -- it must be retained as is. Each record
is separated by roughly 40+ new lines.
Here's some awk ( this works ):
BEGIN {
RS="\n\n\n\n\n\n\n\n\n+"
FS="\n"
}
!/CHUPACABRA/{print $0}
My stab with perl:
perl -a -F\n -ne '$/ = "\n\n\n\n\n\n\n\n\n+";$\ = "\n";chomp;$regex="CHUPACABRA";print $_ if $_ !~ m/$regex/i;' data/lib51.000
Nothing is returned. I'm not sure how to specify 'field separator' in perl except at the commandline. Tried the a2p utility -- no dice. For the curious, here's what it produces:
eval '$'.$1.'$2;' while $ARGV[0] =~ /^([A-Za-z
# process any FOO=bar switches
#$FS = ' '; # set field separator
$, = ' '; # set output field separator
$\ = "\n"; # set output record separator
$/ = "\n\n\n\n\n\n\n\n\n+";
$FS = "\n";
while (<>) {
chomp; # strip record separator
if (!/CHUPACABRA/) {
print $_;
}
}
This has to run under someone's Windows box otherwise I'd stick with awk.
Thanks!
Bubnoff
EDIT ( SOLVED )
Thanks mob!
Here's a ( working ) perl script version ( adjusted a2p output ):
eval '$'.$1.'$2;' while $ARGV[0] =~ /^([A-Za-z
# process any FOO=bar switches
#$FS = ' '; # set field separator
$, = ' '; # set output field separator
$\ = "\n"; # set output record separator
$/ = "\n"x10;
$FS = "\n";
while (<>) {
chomp; # strip record separator
if (!/CHUPACABRA/) {
print $_;
}
}
Feel free to post improvements or CPAN goodies that make this more idiomatic and/or perl-ish. Thanks!
In Perl, the record separator is a literal string, not a regular expression. As the perlvar doc famously says:
Remember: the value of $/ is a string, not a regex. awk has to be better for something. :-)
Still, it looks like you can get away with $/="\n" x 10 or something like that:
perl -a -F\n -ne 'BEGIN{$/="\n"x10;$\="\n"} chomp;$regex="CHUPACABRA";
print if /\S/ && !m/$regex/i;' data/lib51.000
Note the extra /\S/ &&, which will skip empty paragraphs from input that has more than 20 consecutive newlines.
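If a regex separator like awk's RS is really wanted, one option is to read the whole input as a single string and split it yourself. The sketch below is an illustration, not part of the answer above: filter_records is a hypothetical name, and treating any run of 9 or more newlines as a separator is an assumption about the intended threshold.

```perl
use strict;
use warnings;

# Hypothetical helper: split $text into records on runs of 9 or more
# newlines (assumed threshold) and return the non-blank records that
# do NOT match $regex.
sub filter_records {
    my ($text, $regex) = @_;
    return grep { /\S/ && !/$regex/ } split /\n{9,}/, $text;
}

# Made-up sample input with three records.
my $text = "FAKE ORGANIZATION\n799 S FAKE AVE"
         . ("\n" x 12) . "Pickup at:CHUPACABRA LOCATION"
         . ("\n" x 40) . "GRIMLY, WILFORD";
print "$_\n" for filter_records($text, qr/CHUPACABRA/i);
```

To run it on a real file, slurp the contents into a string first (for example by reading with $/ localized to undef) and pass that string in. Note that this drops the separators themselves, so it is unsuitable if the original spacing between records must be reproduced exactly.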
Also, have you considered just installing Cygwin and having awk available on your Windows machine?
There is no need for (much) conversion if you can download gawk for Windows.
Did you know that Perl comes with a program called a2p that does exactly what you described you want to do in your title?
And, if you have Perl on your machine, the documentation for this program is already there:
C> perldoc a2p
My own suggestion is to get the Llama book and learn Perl anyway. Despite what the Python people say, Perl is a great and flexible language. If you know shell, awk and grep, you'll understand many of the Perl constructs without any problems.

How can I change spaces to underscores and lowercase everything?

I have a text file which contains:
Cycle code
Cycle month
Cycle year
Event type ID
Event ID
Network start time
I want to replace every space in this text with a _ and then convert all the characters to lower case, like this:
cycle_code
cycle_month
cycle_year
event_type_id
event_id
network_start_time
How could I accomplish this?
Another Perl method:
perl -pe 'y/A-Z /a-z_/' file
tr alone works:
tr ' [:upper:]' '_[:lower:]' < file
Looking into sed documentation some more and following advice from the comments the following command should work.
sed -r {filehere} -e 's/[A-Z]/\L&/g;s/ /_/g' -i
There is a perl tag in your question as well. So:
#!/usr/bin/perl
use strict; use warnings;
while (<DATA>) {
print join('_', split ' ', lc), "\n";
}
__DATA__
Cycle code
Cycle month
Cycle year
Event type ID
Event ID
Network start time
Or:
perl -i.bak -wple '$_ = join("_", split " ", lc)' test.txt
sed "y/ABCDEFGHIJKLMNOPQRSTUVWXYZ /abcdefghijklmnopqrstuvwxyz_/" filename
Just use your shell, if you have Bash 4
while read -r line
do
line=${line,,} #change to lowercase
echo "${line// /_}"
done < "file" > newfile
mv newfile file
With gawk:
awk '{$0=tolower($0);$1=$1}1' OFS="_" file
With Perl:
perl -ne 's/ +/_/g;print lc' file
With Python:
>>> f=open("file")
>>> for line in f:
... print('_'.join(line.split()).lower())
>>> f.close()