truncate all lines in a file while preserving whole words - perl

I'm trying to shorten each line of a file to 96 characters while preserving whole words. If a line is 96 characters or fewer, I want to leave it alone. If it is over 96 characters, I want to cut it down to the longest length not exceeding 96 that still preserves whole words. When I run this code, I get a blank file.
use Text::Autoformat;
use strict;
use warnings;
#open the file
my $filename = $ARGV[0]; # store the 1st argument into the variable
open my $file, '<', $filename;
open my $fileout, '>>', $filename.96;
my @file = <$file>; #each line of the file into an array
while (my $line = <$file>) {
chomp $line;
foreach (@file) {
#######
sub truncate($$) {
my ( $line, $max ) = @_;
# always do nothing if already short enough
( length( $line ) <= $max ) and return $line;
# forced to chop a word anyway
if ( $line =~ /\s/ ) {
return substr( $line, 0, $max );
}
# otherwise truncate on word boundary
$line =~ s/\S+$// and return $line;
die; # unreachable
}
#######
my $truncated = &truncate($line,96);
print $fileout "$truncated\n";
}
}
close($file);
close($fileout);

You have no output because you have no input.
1. my @file = <$file>; #each line of the file into an array
2. while (my $line = <$file>) { ...
The <$file> operation in line 1 is in list context, so it "consumes" all the input and loads it into @file. The <$file> operation in line 2 then has no more input to read, so the while loop never executes.
You either want to stream from the filehandle
# don't call @file = <$file>
while (my $line = <$file>) {
chomp $line;
my $truncated = &truncate($line, 96);
...
}
Or read from the array of file contents
my @file = <$file>;
foreach my $line (@file) {
chomp $line;
my $truncated = &truncate($line, 96);
...
}
If the input is large, the former format has the advantage of just loading a single line into memory at a time.
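For reference, here is a minimal sketch of the streaming variant combined with a word-boundary cut. The helper name truncate_words and the in-memory sample text are my own assumptions for illustration, not code from the question; in a real script you would read from your input filehandle and print to your output filehandle instead.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the question): cut $line to at most $max
# characters, backing up to the previous word boundary where possible.
sub truncate_words {
    my ($line, $max) = @_;
    return $line if length($line) <= $max;    # already short enough

    # Look at one character past the limit, so a word that ends exactly
    # at $max is kept whole.
    my $head = substr($line, 0, $max + 1);

    # Greedy match: keep everything before the last run of whitespace.
    return $1 if $head =~ /^(.*\S)\s+\S*$/s;

    # No usable whitespace: forced to chop inside a word.
    return substr($line, 0, $max);
}

# Sample input held in memory (assumption, keeps the sketch self-contained).
my $text = "short line\nthe quick brown fox jumps over the lazy dog\n";
open my $in, '<', \$text or die "Cannot open in-memory file: $!";

# Stream one line at a time; never slurp the handle first.
while (my $line = <$in>) {
    chomp $line;
    print truncate_words($line, 20), "\n";
}
close $in;
```

With a limit of 20, the first sample line passes through unchanged and the second is cut back to "the quick brown fox".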

Related

Perl - substring keywords

I have a text file with a lot of lines. I need to search this file for keywords and, when one is found, write the matching line plus the line above and the line below it to a log file. Right now the keyword search does not work properly: once a match is found, everything after it is written, and I don't know how to write the lines below and above. Thanks for any advice.
my $vstup = "C:/Users/Omega/Documents/Kontroly/testkontroly/kontroly20220513_154743.txt";
my $log = "C:/Users/Omega/Documents/Kontroly/testkontroly/kontroly.log";
open( my $default_fh, "<", $vstup ) or die $!;
open( my $main_fh, ">", $log ) or die $!;
my $var = 0;
while ( <$default_fh> ) {
if (/\Volat\b/) {
$var = 1;
}
if ( $var ) {
print $main_fh $_;
}
}
close $default_fh;
close $main_fh;
The approach below uses one flag variable and a buffer variable to get the desired behavior.
Note that the pattern was replaced by 'A' to simplify testing.
#!/usr/bin/perl
use strict;
use warnings;
my ($in_fh, $out_fh);
my ($in, $out);
$in = 'input.txt';
$out = 'output.txt';
open($in_fh, '<', $in) or die "Cannot open $in: $!\n";
open($out_fh, '>', $out) or die "Cannot open $out: $!\n";
my $p_next = 0;
my $p_line;
while (my $line = <$in_fh>) {
# print line after occurrence
print $out_fh $line if ($p_next);
if ($line =~ /A/) {
if (defined($p_line)) {
# print previous line
print $out_fh $p_line;
# once printed undefine variable to avoid printing it again in the next loop
undef($p_line);
}
# Print current line if not already printed as the line follows a pattern
print $out_fh $line if (!$p_next);
# toggle semaphore to print the next line
$p_next = 1;
} else {
# pattern not found.
# if pattern was not detected in both current and previous line.
$p_line = $line if (!$p_next);
$p_next = 0;
}
}
close($in_fh);
close($out_fh);
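If the file fits in memory, the same result can be had with less state by slurping the lines and marking which indexes to keep. This is only a sketch under that assumption; grep_with_context is a hypothetical helper name and the sample lines are made up.

```perl
use strict;
use warnings;

# Hypothetical helper: return every line matching $re plus one line of
# context before and after it, in file order and without duplicates.
sub grep_with_context {
    my ($re, @lines) = @_;
    my %keep;
    for my $i (0 .. $#lines) {
        next unless $lines[$i] =~ $re;
        # Mark the previous, current and next index, staying inside the array.
        $keep{$_} = 1 for grep { $_ >= 0 && $_ <= $#lines } ($i - 1 .. $i + 1);
    }
    return map { $lines[$_] } sort { $a <=> $b } keys %keep;
}

# Sample data (assumption); in practice @lines would come from <$in_fh>.
my @out = grep_with_context(qr/A/, "x\n", "A\n", "y\n", "z\n");
print @out;
```

The trade-off is memory: the streaming version above only ever holds two lines, while this one holds the whole file.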

How to check whether one file's value contains in another text file? (perl script)

I would like to check whether the values in one file also appear in another file. If a value is present, the script should report that there is an existing bin for that ID; if not, it should report that there is no existing bin. The problem is that I am not sure how to check all the values at once.
The first text file, DID1, contains:
L84A:D:O:M:
L84C:B:E:D:
The second text file, DID, contains:
L84A:B:E:Q:X:F:i:M:Y:
L84C:B:E:Q:X:F:i:M:Y:
L83A:B:E:Q:X:F:i:M:Y:
If the first four characters match, all the values on that line need to be checked.
For example, L84A has the value M in both the first and the second text file, so it should print that there is an existing M bin.
Below is my code:
use strict;
use warnings;
my $filename = 'DID.txt';
my $filename1 = 'DID1.txt';
my $count = 0;
open( FILE2, "<$filename1" )
or die("Could not open log file. $!\n");
while (<FILE2>) {
my ($number) = $_;
chomp($number);
my @values1 = split( ':', $number );
open( FILE, "<$filename" )
or die("Could not open log file. $!\n");
while (<FILE>) {
my ($line) = $_;
chomp($line);
my @values = split( ':', $line );
foreach my $val (@values) {
if ( $val =~ /$values1[0]/ ) {
$count++;
if ( $values[$count] =~ /$values1[$count]/ ) {
print
"Yes ,There is an existing bin & DID\n @values1\n";
}
else {
print "No, There is an existing bin & DID\n";
}
}
}
}
}
I cannot check all the values. Please give me any advice you can, since this is my first time learning Perl. Thanks a lot :)
Based on my understanding, I wrote this code:
use strict;
use warnings;
#use ReadWrite;
use Array::Utils qw(:all);
use vars qw($my1file $myfile1cnt $my2file $myfile2cnt @output);
$my1file = "did1.txt"; $my2file = "did2.txt";
We read both the first and the second file (DID1 and DID2).
readFileinString($my1file, \$myfile1cnt); readFileinString($my2file, \$myfile2cnt);
As the OP requested, the first four characters of each line in the first file are matched against the second file; if they match, we check the rest of the values on that line of the first file against the second one.
while($myfile1cnt=~m/^((\w){4})\:([^\n]+)$/mig)
{
print "<LineStart>";
my $lineChk = $1; my $full_Line = $3; #print ": $full_Line\n";
my @First_values = split /\:/, $full_Line; #print join "\n", @First_values;
If the first four characters match, then
if($myfile2cnt=~m/^$lineChk\:([^\n]+)$/m)
{
The rest of the matching line is captured and split on colons to get the values to compare with the first file's contents.
my $FullLine = $1; my @second_values = split /:/, $FullLine;
Then check each value from the first file against the values from the matched line of the second file...
foreach my $sngletter(#First_values)
{
If a value appears in both the first and the second file, it is printed.
if( grep {$_ eq "$sngletter"} @second_values)
{
print "Matched: $sngletter\t";
}
}
}
else { print "Not Matched..."; }
This just marks the end of the line in the output.
print "<LineEnd>\n"
}
#------------------>Reading a file
sub readFileinString
#------------------>
{
my $File = shift;
my $string = shift;
use File::Basename;
my $filenames = basename($File);
open(FILE1, "<$File") or die "\nFailed Reading File: [$File]\n\tReason: $!";
read(FILE1, $$string, -s $File, 0);
close(FILE1);
}
Read the search patterns and the data into hashes (the first field is the key), then go through the data and select only the fields listed in the pattern for that key.
use strict;
use warnings;
use feature 'say';
my $input1 = 'DID1.txt'; # look for key,pattern(array)
my $input2 = 'DID.txt'; # data - key,elements(array)
my $pattern;
my $data;
my %result;
$pattern = file2hash($input1); # read pattern into hash
$data = file2hash($input2); # read data into hash
while( my($k,$v) = each %{$data} ) { # walk through data
next unless defined $pattern->{$k}; # skip those which are not in pattern hash
my $find = join '|', @{ $pattern->{$k} }; # form search pattern for grep
my @found = grep {/$find/} @{ $v }; # extract only those of interest
$result{$k} = \@found; # store in result hash
}
while( my($k,$v) = each %result ) { # walk through result hash
say "$k has " . join ':', @{ $v }; # output final result
}
sub file2hash {
my $filename = shift;
my %hash;
my $fh;
open $fh, '<', $filename
or die "Couldn't open $filename";
while(<$fh>) {
chomp;
next if /^\s*$/; # skip empty lines
my($key,@data) = split ':';
$hash{$key} = \@data;
}
close $fh;
return \%hash;
}
Output
L84C has B:E
L84A has M
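Because the answer above matches with an unanchored regex, a multi-character bin name could match as a substring of another. If that matters for your data, an exact lookup through a hash avoids it. The sketch below assumes the colon-separated format shown in the question; parse_bins and common_bins are hypothetical helper names, and the sample records are copied from the question rather than read from the files.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "KEY:a:b:c:" lines into { KEY => [a, b, c] }.
sub parse_bins {
    my %h;
    for my $line (@_) {
        my ($key, @bins) = split /:/, $line;   # trailing empty field is dropped
        $h{$key} = \@bins if defined $key;
    }
    return \%h;
}

# Hypothetical helper: the wanted bins that also exist in the data,
# compared by exact string equality.
sub common_bins {
    my ($wanted, $have) = @_;
    my %seen = map { $_ => 1 } @$have;
    return grep { $seen{$_} } @$wanted;
}

my $pattern = parse_bins('L84A:D:O:M:', 'L84C:B:E:D:');
my $data    = parse_bins('L84A:B:E:Q:X:F:i:M:Y:',
                         'L84C:B:E:Q:X:F:i:M:Y:',
                         'L83A:B:E:Q:X:F:i:M:Y:');

for my $key (sort keys %$pattern) {
    my @found = exists $data->{$key}
              ? common_bins($pattern->{$key}, $data->{$key})
              : ();
    print @found ? "$key has existing bin(s): @found\n"
                 : "$key has no existing bin\n";
}
```

Sorting the keys also makes the output order stable, which each() on a hash does not guarantee.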

how to write my results to external file in perl

I am trying to read some particular columns from my data into an output file. I succeeded in reading one column at a time, but I want to read several columns of interest at once (I have the list of columns I want to extract in a separate text file), because extracting individual columns and joining them into one file would be tedious. Here is the code I tried for extracting a single column:
#!/usr/bin/perl
use strict;
use warnings;
open (DATA, "<file.txt") or die ("Unable to open file");
my $search_string = "IADC512444";
my $header = <DATA>;
my @header_titles = split /\t/, $header;
my $extract_col = 0;
for my $header_line (@header_titles) {
last if $header_line =~ m/$search_string/;
$extract_col++;
}
print "Extracting column $extract_col\n";
while ( my $row = <DATA> ) {
last unless $row =~ /\S/;
chomp $row;
my @cells = split /\t/, $row;
print "$cells[$extract_col] ";
}
Is there any way to extract all the columns I want from my text file at once, instead of only IADC512444, and write them to an output file on my hard disk? Please help me solve this problem.
Thanks
If you need to print the contents to a file on disk, then you should open a file in write mode and write to it. If you want more columns, you can get them by accessing the corresponding elements of the array @cells. In this example I am printing the column you are already printing plus columns 1 and 2.
open(OUT_FILE,">path_to_out_file") || die "cant open file...";
while ( my $row = <DATA> ) {
last unless $row =~ /\S/;
chomp $row;
my @cells = split /\t/, $row;
#print "$cells[$extract_col] ";
print OUT_FILE "$cells[$extract_col],$cells[1],$cells[2]\n";
}
close(OUT_FILE);
I have tweaked the code a little bit to suit your requirement.
In the variable $req_hdr_string, list the column names you require, separated by commas.
The string is split and the names are stored in a hash.
Then I get the position of each required column from the header and print only those columns.
#!/usr/bin/perl
use strict;
use warnings;
open (DATA, "<h11.txt") or die ("Unable to open file");
my $req_hdr_string = "abc,ghi,mno,";
my %req_hdrs = ();
my %extract_col = ();
foreach(split /,/, $req_hdr_string)
{
print "req hdr is:$_\n";
$req_hdrs{$_} = $_;
}
my $index = 0;
my $header = <DATA>;
chomp $header;
foreach (split /\t/, $header)
{
print "input is:|$_|\n";
if(exists $req_hdrs{$_})
{
print "\treq index is:$index\n";
$extract_col{$index} = 1;
}
$index++;
}
open(OUT_FILE,">out_file") || die "cant open file...";
while ( my $row = <DATA> )
{
last unless $row =~ /\S/;
chomp $row;
my @cells = split /\t/, $row;
# sort the column indexes numerically so the output keeps the input column order
foreach $index (sort { $a <=> $b } keys %extract_col)
{
print OUT_FILE "$cells[$index],";
}
print OUT_FILE "\n";
}
close(OUT_FILE);
close(DATA);
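A more compact way to do the same thing is to map header names to positions once and then take an array slice per row. This is a sketch, not a drop-in replacement: the column names, the sample data and the in-memory filehandle are all assumptions made so the example runs on its own.

```perl
use strict;
use warnings;

my @wanted = qw(abc ghi mno);    # column names to extract (sample values)

# Sample tab-separated input held in memory (assumption for the sketch).
my $tsv = "abc\tdef\tghi\tjkl\tmno\n1\t2\t3\t4\t5\n6\t7\t8\t9\t10\n";
open my $in, '<', \$tsv or die "Cannot open in-memory file: $!";

# Map each header title to its column position.
chomp(my $header = <$in>);
my @titles = split /\t/, $header;
my %pos;
@pos{@titles} = 0 .. $#titles;

# Resolve the wanted names to indexes, in the order they were requested.
my @idx;
for my $name (@wanted) {
    die "No column named '$name'\n" unless exists $pos{$name};
    push @idx, $pos{$name};
}

my @rows;
while (my $row = <$in>) {
    chomp $row;
    next unless $row =~ /\S/;              # skip blank lines
    my @cells = split /\t/, $row, -1;      # -1 keeps trailing empty cells
    push @rows, join ',', @cells[@idx];    # array slice picks the columns
}
close $in;

print "$_\n" for @rows;
```

Joining the slice avoids the trailing comma that printing each cell followed by "," produces, and dying on an unknown name catches typos in the requested column list early.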

I can't output properly

I'm trying to print a character from a file each time I get a char as input.
My problem is that it prints the whole line. I know it's a logic problem, I just can't figure out how to fix it.
use Term::ReadKey;
$inputFile = "input.txt";
open IN, $inputFile or die "I can't open the file :$ \n";
ReadMode("cbreak");
while (<IN>) {
$line = <IN>;
$char = ReadKey();
foreach $i (split //, $line) {
print "$i" if ($char == 0);
}
}
Move the ReadKey call into the foreach loop.
use strictures;
use autodie qw(:all);
use Term::ReadKey qw(ReadKey ReadMode);
my $inputFile = 'input.txt';
open my $in, '<', $inputFile;
ReadMode('cbreak');
while (my $line = <$in>) {
foreach my $i (split //, $line) {
my $char = ReadKey;
print $i;
}
}
END { ReadMode('restore') }
Your original code has 3 problems:
1. You only read the character once (outside the for loop).
2. You read one line from the input file when testing while (<IN>) { (LOSING that line!) and then another in $line = <IN>; so your logic only ever sees the even-numbered lines.
3. print "$i" prints the characters with no newline between them, so you don't see them separated.
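The lost-line problem is easy to reproduce in isolation. A small sketch, using an in-memory filehandle and made-up sample text instead of input.txt:

```perl
use strict;
use warnings;

# Four sample lines held in memory (assumption for the sketch).
my $text = "line1\nline2\nline3\nline4\n";
open my $in, '<', \$text or die "Cannot open in-memory file: $!";

my @seen;
while (<$in>) {              # this read puts a line in $_ and it is never used
    my $line = <$in>;        # this read fetches the FOLLOWING line
    last unless defined $line;
    chomp $line;
    push @seen, $line;
}
close $in;

print "@seen\n";             # prints: line2 line4
```

Writing the loop as while (my $line = <$in>) reads each line exactly once.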
My script reads all the files in a directory, puts them in a list, and chooses a random file from that list.
After that, each time it gets an input character from the user, it prints a character from the file.
#!C:\perl\perl\bin\perl
use Term::ReadKey qw(ReadKey ReadMode);
use autodie qw(:all);
use IO::Handle qw();
use Fatal qw( open );
STDOUT->autoflush(1);
my $directory = "codes"; #directory's name
opendir (DIR, $directory) or die "I can't open the directory $directory: $!\n"; #open the dir
my @allFiles; #array of all the files
while (my $file = readdir(DIR)) { #read each file from the directory
next if ($file =~ m/^\./); #exclude it if it starts with '.'
push(@allFiles, $file); #add file to the array
}
closedir(DIR); #close the input directory
my $filesNr = scalar(grep {defined $_} @allFiles); #get the size of the files array
my $randomNr = int(rand($filesNr)); #generate a random number in the given range (size of array)
my $file = $allFiles[$randomNr]; #get the file at given index
open IN, '<', "$directory/$file" or die "I can't open the file $file: $!\n"; #read the given file (readdir returns bare names, so prepend the directory)
ReadMode('cbreak'); #don't print the user's input
while (my $line = <IN>) { #read each line from file
foreach my $i (split //, $line) { #split the line in characters (including \n & \t)
print "$i" if ReadKey(); #if keys are pressed, print the indexed char
}
}
END {
ReadMode('restore') #deactivate 'cbreak' read mode
}

How to remove new line characters until each line has a specific number of instances of a specific character?

I have a real mess of a pipe-delimited file, which I need to load to a database. The file has 35 fields, and thus 34 pipes. One of the fields is comprised of HTML code which, for some records, includes multiple line breaks. Unfortunately there's no pattern as to where the line breaks fall.
The solution I've come up with is to count the number of pipes in each line and until that number reaches 34, remove the new line character from that line. I'm not incredibly well-versed in Perl, but I think I'm close to achieving what I'm looking to do. Any suggestions?
#!/usr/local/bin/perl
use strict;
open (FILE, 'test.txt');
while (<FILE>) {
chomp;
my $line = $_;
#remove null characters that are included in file
$line =~ tr/\x00//;
#count number of pipes
my $count = ($line =~ tr/|//);
#each line should have 34 pipes
while ($count < 34) {
#remove new lines until line has 34 pipes
$line =~ tr/\r\n//;
$count = ($line =~ tr/|//);
print "$line\n";
}
}
This should work I guess.
#!/usr/bin/perl
use strict;
open (FILE, 'test.txt');
my $num_pipes = 0, my $line_num = 0;
my $tmp = "";
while (<FILE>)
{
$line_num++;
chomp;
my $line = $_;
$line =~ tr/\x00//d; #remove null characters that are included in file (without /d, tr only counts them)
$num_pipes += ($line =~ tr/|//); #count number of pipes
if ($num_pipes == 34 && length($tmp))
{
$tmp .= $line;
print "$tmp\n";
# Reset values.
$tmp = "";
$num_pipes = 0;
}
elsif ($num_pipes == 34 && length($tmp) == 0)
{
print "$line\n";
$num_pipes = 0;
}
elsif ($num_pipes < 34)
{
$tmp .= $line;
}
elsif ($num_pipes > 34)
{
print STDERR "Error before line $line_num. Too many pipes ($num_pipes)\n";
$num_pipes = 0;
$tmp = "";
}
}
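The same accumulate-until-complete idea can be written as a small function, which makes it easier to try on sample data. This is a sketch under two assumptions: the field with embedded line breaks is not the last field of a record, and no field contains a literal pipe. join_records is a hypothetical name, and the sample uses 3 pipes per record instead of 34 to stay short.

```perl
use strict;
use warnings;

# Hypothetical helper: glue physical lines together until each logical
# record contains exactly $want pipe characters.
sub join_records {
    my ($want, @lines) = @_;
    my @records;
    my $buf = '';
    for my $line (@lines) {
        $line =~ tr/\x00\r\n//d;          # /d deletes NULs and line breaks
        $buf .= $line;
        my $count = ($buf =~ tr/|//);     # without /d, tr just counts
        if ($count == $want) {
            push @records, $buf;          # record complete
            $buf = '';
        }
        elsif ($count > $want) {
            die "Too many pipes ($count) in record: $buf\n";
        }
    }
    warn "Incomplete record at end of input: $buf\n" if length $buf;
    return @records;
}

# Sample data (assumption): the first record is split across two lines.
my @records = join_records(3, "a|b<p>\n", "</p>|c|d\n", "e|f|g|h\n");
print "$_\n" for @records;
```

Here the two halves of the first record are rejoined into "a|b<p></p>|c|d" and the second record passes through unchanged.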
Twiddle with $/, the input record separator?
while (!eof(FILE)) {
# assemble a row of data: 35 pipe separated fields, possibly over many lines
my @fields = ();
{
# read 34 fields from FILE:
local $/ = '|';
for (1..34) {
chomp(my $field = <FILE>); # chomp removes the current $/, i.e. the trailing '|'
push @fields, $field;
}
} # $/ is set back to original value ("\n") at the end of this block
chomp(my $last = <FILE>); # read last field, which ends with newline
push @fields, $last;
tr/\r\n//d for @fields; # drop line breaks embedded in the fields
my $line = join '|', @fields;
# ... now you can process $line, and you already have the @fields ...
}