Move last character of line to specific column -- sed? awk? - perl

I need to rewrite every line that ends with a specific character (say, &) so that this character lands in a certain column (say, 80).
Which tool is best?
I have started thinking about sed:
sed 's/\(.*\)&/\1 <what should be here??> &/'
but I cannot work out how to insert a variable number of spaces so that the & ends up in column 80.
Thanks!

Use the /e switch to s/// that tells Perl to evaluate the replacement portion to compute the result.
#! /usr/bin/env perl
use strict;
use warnings;
while (<>) {
s/^(.*)(&)$/$1 . " " x (79 - length $1) . $2/e;
print;
}
Sample run:
$ echo -e 'foo&\n&\nbar &\nbaz' | ./align-ampersands
foo &
&
bar &
baz
If your input contains TAB characters, you will need to use more sophisticated processing.
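For the TAB case, one possible approach is to expand the tabs to spaces first, so that length() matches the displayed width. This is only a sketch: align_amp is a made-up helper name, and it assumes the core Text::Tabs module with its default tab stop of 8.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Text::Tabs;    # core module; exports expand()

# Hypothetical helper: move a trailing '&' to column $col (1-based).
# Assumes $line has no trailing newline.
sub align_amp {
    my ($line, $col) = @_;
    $line = expand($line);    # TABs become spaces (tab stop 8 by default)
    if ($line =~ /^(.*)&$/) {
        my $pad = $col - 1 - length $1;
        $pad = 0 if $pad < 0;    # line already too long: no padding
        $line = $1 . (' ' x $pad) . '&';
    }
    return $line;
}

print align_amp("a\tb&", 20), "\n";
print align_amp("baz", 20), "\n";
```

Lines that do not end in & pass through with only their tabs expanded.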

Not sure if I understand your question correctly but you can try something like (assuming your file is space delimited):
awk '/&$/ {for(i=1;i<=NF;i++) $i=(i==80)?"& "$i:$i}1' yourFile

Awk and Perl will both work. Both have printf and substr:
#! /usr/bin/env perl
use warnings;
use strict;
my $string = "this is some text &";
my $last_char = substr($string, -1, 1);
$string = substr ($string, 0, length ($string ) - 1);
printf qq(%-79.79s%s\n), $string, $last_char;
The substr command is available in both Awk and Perl.
The whole command could be made into a one liner:
printf qq(%-79.79s%s\n), substr ($string, 0, length ($string ) - 1), substr($string, -1, 1);
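Note that the .79 precision in that format truncates text longer than 79 characters instead of pushing the & past column 80. A small sketch of the same idea with the column as a parameter (last_char_to_col is a hypothetical name, and a short column is used so the effect is easy to see):

```perl
use strict;
use warnings;

# Hypothetical helper: put the last character of $text in column $col (1-based).
sub last_char_to_col {
    my ($text, $col) = @_;
    my $body = substr($text, 0, -1);    # everything except the last character
    my $last = substr($text, -1);       # the last character
    # %-*.*s : left-justify, pad to the given width, truncate to the same width
    return sprintf('%-*.*s%s', $col - 1, $col - 1, $body, $last);
}

print last_char_to_col('short &', 12), "|\n";
print last_char_to_col('this text is too long &', 12), "|\n";
```

If truncation is not what you want, dropping the precision (plain %-*s) pads short lines and leaves long ones untouched.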

awk '/&$/{$80="&"}1' file

Related

Replace characters in certain postion of lines with whitespace

I need to be able to replace character positions 58-71 with whitespace on every line in a file, on Unix / Solaris.
Extract example:
LOCAX0791LOCPIKAX0791LOC AX0791LOC095200130008PIKAX079100000000000000WL1G011 000092000000000000
LOCAX0811LOCPIKAX0811LOC AX0811LOC094700450006PIKAX0811000000000000006C1G011 000294000000000000
LOCAX0831LOCPIKAX0831LOC AX0831LOC094000180006PIKAX083100000000000000OJ1G011 000171000000000000
Or:
sed -r 's/^(.{57})(.{14})/\1              /' bar.txt
With apologies for the horrible 14 space string.
Simple Perl oneliner
perl -pe 'substr($_, 57, 14) = " " x 14;' inputfile.txt > outputfile.txt
try this:
awk 'BEGIN{FS=OFS=""} {for(i=57;i<=71;i++)$i=" "}1' file
output for your first line:
LOCAX0791LOCPIKAX0791LOC AX0791LOC095200130008PIKAX079 WL1G011
Try this in Perl:
use strict;
use warnings;
while(<STDIN>) {
my @input = split(//, $_);
# Columns 58-71 (1-based) are indexes 57..70 (0-based).
for(my $i=57; $i<71; $i++) {
$input[$i] = " ";
}
$_ = join('', @input);
print $_;
}
If you have gawk on your Solaris box, you could try:
gawk 'BEGIN{FIELDWIDTHS = "57 14 1000"} gsub(/./," ",$2)' OFS= file

Perl parsing - mixture of chars, tabs and spaces

I have the following types of line in my code:
MMAPI_CLOCK_OUTPUTS = 1, /*clock outputs system*/
MMAPI_SYSTEM_MANAGEMENT = 0, /*sys man system*/
I want to parse them to get:
'MMAPI_CLOCK_OUTPUTS'
'1'
'clock outputs system'
So I tried:
elsif($TheLine =~ /\s*(.*)s*=s*(.*),s*\/*(.*)*\//)
but this doesn't get the last string 'clock outputs system'
What should the parsing code actually be?
You should escape the slashes, stars and the s for spaces. Instead of writing /, * or s in your regex, write \/, \* and \s:
/\s*(.*)\s*=\s*(.*),\s*\/\*(.*)\*\//
if($TheLine =~ m%^(\S+)\s+=\s+(\d+),\s+/\*(.*)\*/%) {
print "$1 $2 $3\n"
}
This uses % as an alternative delimiter in order to avoid leaning toothpick syndrome when you escape the / characters.
Try this regex: /^\s*(.*?)\s*=\s*(\d+),\s*\/\*(.*?)\*\/$/
Here is an example in which you can test it:
#!/usr/bin/perl
use strict;
use warnings;
my $str = "MMAPI_CLOCK_OUTPUTS = 1, /*clock outputs system*/\n
MMAPI_SYSTEM_MANAGEMENT = 0, /*sys man system*/";
while ($str =~ /^\s*(.*?)\s*=\s*(\d+),\s*\/\*(.*?)\*\/$/gm) {
print "$1 $2 $3 \n";
}
# Output:
# MMAPI_CLOCK_OUTPUTS 1 clock outputs system
# MMAPI_SYSTEM_MANAGEMENT 0 sys man system

How to compress 4 consecutive blank lines into one single line in Perl

I'm writing a Perl script that reads a log and rewrites it to a new log, collapsing any run of 4 or more consecutive blank lines into a single blank line. Runs of 1, 2 or 3 blank lines must be left as they are. I have searched online, but the only solutions I can find are
perl -00 -pe ''
or
perl -00pe0
Also, I see the example in vim like this to delete blocks of 4 empty lines :%s/^\n\{4}// which match what I'm looking for but it was in vim not Perl. Can anyone help in this? Thanks.
To collapse 4+ consecutive Unix-style EOLs to a single newline:
$ perl -0777 -pi.bak -e 's|\n{4,}|\n|g' file.txt
An alternative flavor using look-behind:
$ perl -0777 -pi.bak -e 's|(?<=\n)\n{3,}||g' file.txt
use strict;
use warnings;
my $cnt = 0;
sub flush_ws {
$cnt = 1 if ($cnt >= 4);
while ($cnt > 0) {print "\n"; $cnt--; }
}
while (<>) {
if (/^$/) {
$cnt++;
} else {
flush_ws();
print $_;
}
}
flush_ws();
Your -0 hint is a good one, since you can use -0777 to slurp the whole file in -p mode. Read more about these switches in perlrun. So this one-liner should do the trick:
$ perl -0777 -pe 's/\n{5,}/\n\n/g'
If there are up to four new lines in a row, nothing happens. Five newlines or more (four empty lines or more) are replaced by two newlines (one empty line). Note the /g switch here to replace not only the first match.
Deparsed code:
BEGIN { $/ = undef; $\ = undef; }
LINE: while (defined($_ = <ARGV>)) {
s/\n{5,}/\n\n/g;
}
continue {
die "-p destination: $!\n" unless print $_;
}
HTH! :)
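The five-newline rule above can be checked on small strings. This is just a sketch wrapping the same substitution in a sub (squeeze_blank_runs is a made-up name) and counting the empty lines that survive:

```perl
use strict;
use warnings;

# Hypothetical helper: squeeze runs of 4 or more empty lines down to one.
sub squeeze_blank_runs {
    my ($text) = @_;
    $text =~ s/\n{5,}/\n\n/g;    # 5+ newlines in a row = 4+ empty lines
    return $text;
}

for my $n (3, 4, 6) {
    my $in   = "a\n" . ("\n" x $n) . "b\n";
    my $out  = squeeze_blank_runs($in);
    my $left = ($out =~ tr/\n//) - 2;    # empty lines left between a and b
    print "$n empty lines in, $left out\n";
}
```

One case worth knowing about: a run at the very start of the file has no preceding line terminator, so four empty lines there are only four newlines and are left alone.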
One way using GNU awk, setting the record separator to NUL:
awk 'BEGIN { RS="\0" } { gsub(/\n{5,}/,"\n")}1' file.txt
This assumes that your definition of empty excludes whitespace.
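If lines containing only spaces or tabs should also count as empty, something along these lines might work in Perl. It is a sketch with a made-up sub name, and it assumes that 4 or more such lines should collapse to one truly empty line:

```perl
use strict;
use warnings;

# Hypothetical helper: collapse 4+ consecutive lines that are empty or hold
# only spaces/tabs into a single empty line.
sub squeeze_whitespace_runs {
    my ($text) = @_;
    # With /m, ^ matches at the start of every line, so each repetition of
    # the group consumes exactly one blank-looking line.
    $text =~ s/(?:^[ \t]*\n){4,}/\n/mg;
    return $text;
}

my $log = "a\n \n\t\n\n  \nb\n";    # four blank-looking lines between a and b
print squeeze_whitespace_runs($log);
```

Because the pattern is anchored on line starts rather than on the preceding newline, it also handles a run at the very beginning of the input.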
This will do what you need
perl -ne 'if (/\S/) {$n = 1 if $n >= 4; print "\n" x $n, $_; $n = 0} else {$n++}' myfile

sed, replace globally a delimiter with the first part of the line

Let's say I have the following lines:
1:a:b:c
2:d:e:f
3:a:b
4:a:b:c:d:e:f
how can I edit this with sed (or perl) in order to read:
1a1b1c
2d2e2f
3a3b
4a4b4c4d4e4f
I have done with awk like this:
awk -F':' '{gsub(/:/, $1, $0); print $0}'
but it takes ages to complete! So I'm looking for something faster.
'Tis a tad tricky, but it can be done with sed (assuming the file data contains the sample input):
$ sed '/^\(.\):/{
s//\1/
: retry
s/^\(.\)\([^:]*\):/\1\2\1/
t retry
}' data
1a1b1c
2d2e2f
3a3b
4a4b4c4d4e4f
$
You may be able to flatten the script to one line with semi-colons; sed on MacOS X is a bit cranky at times and objected to some parts, so it is split out into 6 lines. The first line matches lines starting with a single character and a colon and starts a sequence of operations for when that is recognized. The first substitute replaces, for example, '1:' by just '1'. The : retry is a label for branching to, a key part of this. The next substitution copies the first character on the line over the first colon. The t retry goes back to the label if the substitute changed anything. The last line delimits the entire sequence of operations for the initially matched line.
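For comparison, the same retry loop can be written in Perl, which may make the control flow easier to follow. This is a sketch rather than a tuned solution: spread_first_char is a hypothetical name and, like the sed script, it assumes the first field is a single character.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the sed script step by step.
sub spread_first_char {
    my ($line) = @_;
    if ($line =~ s/^(.):/$1/) {                    # '1:' becomes '1'
        # Same job as ': retry' / 't retry': repeat until nothing changes.
        1 while $line =~ s/^(.)([^:]*):/$1$2$1/;   # first char replaces next ':'
    }
    return $line;
}

print spread_first_char($_), "\n" for qw(1:a:b:c 2:d:e:f 3:a:b 4:a:b:c:d:e:f);
```

Each pass rewrites only the leftmost remaining colon, so a line with n colons takes n substitutions in total.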
#!/usr/bin/perl
use warnings;
use strict;
while (<DATA>) {
if ( s/^([^:]+)// ) {
my $delim = $1;
s/:/$delim/g;
}
print;
}
__DATA__
1:a:b:c
2:d:e:f
3:a:b
4:a:b:c:d:e:f
use feature qw/ say /;
use strict;
use warnings;
while( <DATA> ) {
chomp;
my @elements = split /:/;
my $interject = shift @elements;
local $" = $interject;
say $interject, "@elements";
}
__DATA__
1:a:b:c
2:d:e:f
3:a:b
4:a:b:c:d:e:f
Or on the linux shell command line:
perl -aF/:/ -pe '$i=shift @F;$_=$i.join $i,@F;' infile.txt

How can I remove the timestamp from a filename in Perl?

I have a file which has a line in it as:
/hosting/logs/U01-ecom-SIT01/CU01-DC05-IFIO_SIT01_NU01-nc3sz1ecmas11/waslogs/SystemOut_10.01.21_16.54.18.log
I need a script which would read this line and remove the time stamp from it, that is:
10.01.21_16.54.18
The script should print the filename without the timestamp and holding the full path, that is:
/hosting/logs/U01-ecom-SIT01/CU01-DC05-IFIO_SIT01_NU01-nc3sz1ecmas11/waslogs/SystemOut.log
Please help as I'm unable to pattern match and output the file path without the timestamp.
echo "/hosting/logs/U01-ecom-SIT01/CU01-DC05-IFIO_SIT01_NU01-nc3sz1ecmas11/waslogs/SystemOut_10.01.21_16.54.18.log" |
perl -pe "s/_\d\d\.\d\d\.\d\d_\d\d\.\d\d\.\d\d//;"
$ perl -e 's{_\d{2}\.\d{2}\.\d{2}_\d{2}\.\d{2}\.\d{2}}{} and print for @ARGV' /hosting/logs/U01-ecom-SIT01/CU01-DC05-IFIO_SIT01_NU01-nc3sz1ecmas11/waslogs/SystemOut_10.01.21_16.54.18.log
Path shortened to prevent scrolling:
$ cat paths
CU01-DC05-IFIO_SIT01_NU01-nc3sz1ecmas11/waslogs/SystemOut_10.01.21_16.54.18.log
$ perl -pe 's/(_(\d\d(\.\d\d){2})){2}\.log$/.log/' paths
CU01-DC05-IFIO_SIT01_NU01-nc3sz1ecmas11/waslogs/SystemOut.log
The timestamp is made up of 2 sequences that look like _##.##.##. The subsequences end with 2 sequences of .##. These are the roles of the {2} quantifiers.
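The nesting of the two {2} quantifiers may be easier to see with the /x flag and comments. A sketch of the same pattern, with non-capturing groups swapped in and strip_timestamp as a made-up helper name:

```perl
use strict;
use warnings;

# The timestamp, spelled out with /x so each part can carry a comment.
my $timestamp = qr/
    (?:                  # one _##.##.## sequence:
        _ \d\d           #   underscore and two digits
        (?: \. \d\d ){2} #   then .## twice
    ){2}                 # two such sequences: date, then time
/x;

# Hypothetical helper: drop the timestamp that sits just before ".log".
sub strip_timestamp {
    my ($path) = @_;
    $path =~ s/${timestamp}\.log$/.log/;
    return $path;
}

print strip_timestamp('CU01-DC05-IFIO_SIT01_NU01-nc3sz1ecmas11/waslogs/SystemOut_10.01.21_16.54.18.log'), "\n";
```

A name with only one _##.##.## sequence is left untouched, since the outer {2} requires both the date and the time part.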
while(<>){
chomp;
@s = split /\// ;
$fullpath=join("/",splice @s , 0, $#s);
@a = split /[_.]/ ,$s[-1];
$newfile="$fullpath/$a[0].$a[-1]";
print $newfile."\n";
}
You can use the following code:
use strict;
use warnings;
my $var = "/hosting/logs/U01-ecom-SIT01/CU01-DC05-IFIO_SIT01_NU01-nc3sz1ecmas11/waslogs/SystemOut_10.01.21_16.54.18.log";
$var=~s/_\d\d\.\d\d\.\d\d//g;
# $var=~s/_10\.01\.21_16\.54\.18//g; # You can use this way also
print "$var\n";