Replace characters in certain positions of lines with whitespace - perl

I need to be able to replace character positions 58-71 with whitespace on every line in a file, on Unix / Solaris.
Extract example:
LOCAX0791LOCPIKAX0791LOC AX0791LOC095200130008PIKAX079100000000000000WL1G011 000092000000000000
LOCAX0811LOCPIKAX0811LOC AX0811LOC094700450006PIKAX0811000000000000006C1G011 000294000000000000
LOCAX0831LOCPIKAX0831LOC AX0831LOC094000180006PIKAX083100000000000000OJ1G011 000171000000000000

Or:
sed -r 's/^(.{57})(.{14})/\1              /' bar.txt
With apologies for the horrible 14 space string.

Simple Perl one-liner (1-based positions 58-71 are offset 57, length 14):
perl -pe 'substr($_, 57, 14) = " " x 14;' inputfile.txt > outputfile.txt

Try this:
awk 'BEGIN{FS=OFS=""} {for(i=58;i<=71;i++)$i=" "}1' file
Output for your first line:
LOCAX0791LOCPIKAX0791LOC AX0791LOC095200130008PIKAX0791              WL1G011

Try this in Perl:
use strict;
use warnings;
while (<STDIN>) {
    my @input = split(//, $_);
    # Positions 58-71 (1-based) are array indices 57..70 (0-based).
    for (my $i = 57; $i < 71; $i++) {
        $input[$i] = " ";
    }
    $_ = join('', @input);
    print $_;
}

If you have gawk on your Solaris box, you could try:
gawk 'BEGIN{FIELDWIDTHS = "57 14 1000"} gsub(/./," ",$2)' OFS= file

Related

Move last character of line to specific column -- sed? awk?

I need to rewrite every line that ends with a specific character (say, &) so that this character ends up in a certain column (say, 80).
Which tool is best?
I have started thinking about sed:
sed 's/\(.*\)&/\1 <what should be here??> &/'
but I cannot work out how to insert a variable number of spaces so that & lands in column 80.
Thanks!
Use the /e switch to s/// that tells Perl to evaluate the replacement portion to compute the result.
#! /usr/bin/env perl
use strict;
use warnings;
while (<>) {
s/^(.*)(&)$/$1 . " " x (79 - length $1) . $2/e;
print;
}
Sample run:
$ echo -e 'foo&\n&\nbar &\nbaz' | ./align-ampersands
foo &
&
bar &
baz
If your input contains TAB characters, you will need to use more sophisticated processing.
Not sure if I understand your question correctly but you can try something like (assuming your file is space delimited):
awk '/&$/ {for(i=1;i<=NF;i++) $i=(i==80)?"& "$i:$i}1' yourFile
Awk and Perl will both work. Both have printf and substr:
#! /usr/bin/env perl
use warnings;
use strict;
my $string = "this is some text &";
my $last_char = substr($string, -1, 1);
$string = substr ($string, 0, length ($string ) - 1);
printf qq(%-79.79s%s\n), $string, $last_char;
The substr command is available in both Awk and Perl.
The whole command could be made into a one liner:
printf qq(%-79.79s%s\n), substr ($string, 0, length ($string ) - 1), substr($string, -1, 1);
awk '/&$/{$80="&"}1' file

How to add new number into each line?

I have this line about 500 times in my file backup.xml
my-company-review/</link>
Is there a way (through the command line, perl, etc.) to add a number to the line after the word review? For example, something like this:
my-company-review1/</link>
my-company-review2/</link>
my-company-review3/</link>
Thanks in advance for the help!
Why not use Perl, as I suggested for your last problem? Once again, this is a bit of a hack that only works if there is at most one replacement per line, but it makes a quick throw-away program.
perl -e '$count=1; foreach (<>) { s/(my-company-review)(\/<\/link>)/$1$count$2/ && $count++; print; }'
An extra loop will do multiple substitutions on a line:
perl -e '$count=1; foreach (<>) { while(s/(my-company-review)(\/<\/link>)/$1$count$2/) {$count++;} print; }'
That awk solution looks way nicer =)
Here's one way:
perl -i -wpe ' BEGIN { $count = 1; }
++$count
if s{(my-company-review)(/</link>)}{$1$count$2};
' backup.xml
(Disclaimer: not tested.)
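You can use awk (this prints only the lines that match, and numbers them with the input line number NR, so it fits a file that consists solely of such lines):
awk 'gsub("/</link>", NR "/</link>")' infile
or perl (which likewise numbers by input line number, $., but prints every line):
perl -ne 's:/</link>:$./</link>:; print' infile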

split file into single lines via delimiter

Hi I have the following file:
>101
ADFGLALAL
GHJGKGL
>102
ASKDDJKJS
KAKAKKKPP
>103
AKNCPFIGJ
SKSK
etc etc;
and I need it in the following format:
>101
ADFGLALALGHJGKGL
>102
ASKDDJKJSKAKAKKKPP
>103
AKNCPFIGJSKSK
How can I do this? Perhaps with a Perl one-liner?
Thanks very much!
perl -npe 'chomp if ($.!=1 && !s/^>/\n>/)' input
This removes the trailing newline (chomp) from every line that is not the first line ($.!=1) and does not start with > (the substitution fails, so !s/^>/\n>/ is true). When a later line does start with >, the substitution succeeds and puts a newline in front of it, which terminates the sequence that precedes it.
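perl -lne '
if (/^>/) {
    print $seq if length $seq;   # flush the previous record
    print;                       # print the header line itself
    $seq = "";
} else {
    $seq .= $_;                  # join any number of sequence lines
}
END { print $seq if length $seq }
' file.txt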

How to compress 4 consecutive blank lines into one single line in Perl

I'm writing a Perl script that reads a log and writes it to a new log, removing empty lines wherever there is a run of 4 or more consecutive blank lines. In other words, any run of 4 or more consecutive blank lines has to be compressed into a single line, while runs of 1, 2 or 3 blank lines must be left as they are. I have looked for a solution online, but the only thing I can find is
perl -00 -pe ''
or
perl -00pe0
I have also seen a vim example that deletes blocks of 4 empty lines, :%s/^\n\{4}//, which matches what I'm looking for, but it is for vim, not Perl. Can anyone help with this? Thanks.
To collapse 4+ consecutive Unix-style EOLs to a single newline:
$ perl -0777 -pi.bak -e 's|\n{4,}|\n|g' file.txt
An alternative flavor using look-behind:
$ perl -0777 -pi.bak -e 's|(?<=\n)\n{3,}||g' file.txt
use strict;
use warnings;
my $cnt = 0;
sub flush_ws {
$cnt = 1 if ($cnt >= 4);
while ($cnt > 0) {print "\n"; $cnt--; }
}
while (<>) {
if (/^$/) {
$cnt++;
} else {
flush_ws();
print $_;
}
}
flush_ws();
Your -0 hint is a good one, since you can use -0777 to slurp the whole file in -p mode. Read more about these switches in perlrun. So this one-liner should do the trick:
$ perl -0777 -pe 's/\n{5,}/\n\n/g'
If there are up to four new lines in a row, nothing happens. Five newlines or more (four empty lines or more) are replaced by two newlines (one empty line). Note the /g switch here to replace not only the first match.
Deparsed code:
BEGIN { $/ = undef; $\ = undef; }
LINE: while (defined($_ = <ARGV>)) {
s/\n{5,}/\n\n/g;
}
continue {
die "-p destination: $!\n" unless print $_;
}
HTH! :)
One way using GNU awk, setting the record separator to NUL:
awk 'BEGIN { RS="\0" } { gsub(/\n{5,}/,"\n")}1' file.txt
This assumes that your definition of empty excludes whitespace.
This will do what you need
perl -ne 'if (/\S/) {$n = 1 if $n >= 4; print "\n" x $n, $_; $n = 0} else {$n++}' myfile

reformat text in perl

I have a file of 1000 lines, each line in the format
filename dd/mm/yyyy hh:mm:ss
I want to convert it to read
filename mmddhhmm.ss
been attempting to do this in perl and awk - no success - would appreciate any help
thanks
You can do a simple regular expression replacement if the format is really fixed:
s|(..)/(..)/.... (..):(..):(..)$|$2$1$3$4.$5|
I used | as a separator so that I do not need to escape the slashes.
You can use this with Perl on the shell in place:
perl -pi -e 's|(..)/(..)/.... (..):(..):(..)$|$2$1$3$4.$5|' file
(Look up the option descriptions with man perlrun).
Another, somewhat ugly, approach: for each line ($str here) you read from the file, do something like this:
my $str = 'filename 26/12/2010 21:09:12';
my @arr1 = split(' ', $str);        # filename, date, time
my @arr2 = split('/', $arr1[1]);    # dd, mm, yyyy
my @arr3 = split(':', $arr1[2]);    # hh, mm, ss
my $day = $arr2[0];
my $month = $arr2[1];
my $hours = $arr3[0];
my $minutes = $arr3[1];
my $seconds = $arr3[2];
# The year ($arr2[2]) is not part of the target format mmddhhmm.ss.
print $arr1[0].' '.$month.$day.$hours.$minutes.'.'.$seconds."\n";
Pipe your file to a perl script with:
while ( my $line = <> ) {
    if ( $line =~ /(\S+)\s+(\d{2})\/(\d{2})\/\d{4}\s+(\d{2}):(\d{2}):(\d{2})/ ) {
        # $1 = filename, $2 = dd, $3 = mm, $4 = hh, $5 = mm, $6 = ss
        print $1 . " " . $3 . $2 . $4 . $5 . '.' . $6 . "\n";
    }
}
Redirect the output however you want.
This says match line to:
(non-whitespace>=1)whitespace>=1(2digits)/(2digits)/4digits
whitespace>=1(2digits):(2digits):(2digits)
Capture groups are in () numbered 1 to 6 left to right.
Using sed:
sed -r 's|([0-9]{2})/([0-9]{2})|\2/\1|; s|/[0-9]{4} ||; s|/||; s/://; s/:/./' file.txt
swap day and month
delete the year /yyyy
delete the remaining slash
delete the first colon
change the remaining colon to a dot
Using awk (d[2] is the month and d[1] the day, so the month comes first):
awk '{split($2,d,"/"); split($3,t,":"); print $1, d[2] d[1] t[1] t[2] "." t[3]}' file.txt