File is adding one extra space in each line - perl

I am trying to add all the lines of a file to an array using push, and then write them to another file.
But at the beginning of every line in the output I am seeing one extra whitespace character.
What is the issue? Has anyone faced this before?
open FILE, "a.txt";
while (<FILE>)
{
    my $temp = $_;
    push @array, $temp;
}
close(FILE);
open FILE2, "b.txt";
print FILE2 "@array";
close FILE2;

When you quote an array variable like this: "@array" it gets interpolated with spaces. That's where they come from in your output. So do not quote if you do not need or want this sort of interpolation.
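Here is a tiny illustration of the difference using a throwaway array:
my @array = ("one\n", "two\n");
print "@array";   # interpolated: elements joined with a space, so the second line starts with " two"
print @array;     # list form: no separator inserted, "one" and "two" each start their own line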
Now let's rewrite your program to modern Perl.
use strict;
use warnings FATAL => 'all';
use autodie qw(:all);
my @array;
{
    open my $in, '<', 'a.txt';
    @array = <$in>;
}
{
    open my $out, '>', 'b.txt';
    print {$out} @array;
}

You put quotes around "@array". That makes it a string interpolation, which for arrays is equivalent to join($", @array). The default value for $" is (guess what?) a space.
Try
print FILE2 @array;
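If you do want the interpolation, you can instead change $" (the list separator) locally so that nothing is inserted between the elements; a sketch of that alternative:
{
    local $" = '';        # list separator: empty string instead of a space
    print FILE2 "@array";
}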

open usually takes another argument that specifies whether the file is opened for input or for output (or for both or for some other special case). You have omitted this argument, and so by default FILE2 is an input filehandle.
You wanted to say
open FILE2, '>', "b.txt"
If you put the line
use warnings;
at the beginning of every Perl script, the interpreter will catch many issues like this for you.
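For reference, a corrected sketch of the writing half of the script, with the mode supplied, the open checked, and warnings enabled (the die message is just an example):
use warnings;
open FILE2, '>', 'b.txt' or die "Can't open b.txt: $!";
print FILE2 @array;
close FILE2;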

Perl copying specific lines of VECT File

I want to copy lines 7-12 of files, like this example .vect file, into another .vect file in the same directory.
I want each line to be copied twice, and the two copies of each line to be placed consecutively in the new file.
This is the code I have used so far, and I would like to continue using these methods/packages in Perl.
use strict;
use warnings;
use feature qw(say);
# This method works for reading a single file
my $dir = "D:\\Downloads";
my $readfile = $dir . "\\2290-00002.vect";
my $writefile = $dir . "\\file2.vect";
# Open a file to read
open(DATA1, "<" . $readfile) or die "Can't open '$readfile': $!";
# Open a file to write
open(DATA2, ">" . $writefile) or die "Can't open '$writefile': $!";
# Copy data from one file to another.
while ( <DATA1> ) {
    print DATA2 $_;
}
close( DATA1 );
close( DATA2 );
What would be a simple way to do this using the same opening and closing file syntax I have used above?
Just modify the print line to
print DATA2 $_, $_ if 7 .. 12;
See Range Operators in "perlop - Perl operators and precedence" for details.
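If the flip-flop form looks too magical, the same test can be written explicitly against $., the current input line number:
print DATA2 $_, $_ if $. >= 7 && $. <= 12;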
It's worth remembering the Tie::File module, which maps a file line by line to a Perl array and allows you to manipulate text files using simple array operations. It can be slow when working with large amounts of data, but it is ideal for the majority of applications involving regular text files.
Copying a range of lines from one file to another becomes a simple matter of copying an array slice. Remember that the file starts with line one in array element zero, so lines 7 to 12 are at indexes 6 to 11.
This is the Perl code to do what you ask
use strict;
use warnings;
use Tie::File;
chdir 'D:\Downloads' or die $!;
tie my @infile, 'Tie::File', '2290-00002.vect' or die $!;
tie my @outfile, 'Tie::File', 'file2.vect' or die $!;
@outfile = map { $_, $_ } @infile[6..11];
Nothing else is required. Isn't that neat?
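If this were part of a longer program, you could also untie the arrays once you are finished with them, so that the underlying files are released promptly (untie is a core builtin):
untie @infile;     # release 2290-00002.vect
untie @outfile;    # release file2.vect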

Perl Script can't use Tie::File

I'm trying to run a perl script which uses the Tie::File module.
What it basically is supposed to do is read in all the files from the current directory, cut off the last line of the first document, then the first and last line of every other document and the first line of the last document, then write everything to a new document.
When I try to run my script (which might have some mistakes in it... I'd be happy if someone could correct them if you find any) I get an error message:
Can't locate object method "TIEARRAY" via package "TIE:File" at script.pl line 28, <$fh> line 7.
I've marked line 28 in the code.
I've installed the latest version of Tie::File and checked with cpan Tie::File and cpan Tie::Array whether everything is installed. I received "Tie::Array is up to date (v1.06)" and "Tie::File is up to date (v1.00)" from the terminal, so they must be installed correctly.
#!/usr/bin/perl
use Cwd;
use Tie::File;
use Tie::Array;
my $cwd = getcwd();
my $buff = '';
# Get all files in cwd.
#my @files = grep { -f && /\.txt$/ } readdir $cwd;
my @files = grep ( -f ,<*.txt>);
# Cut off footer of first (files[0]) file
print 'Opening' . $files[0] . "\n";
use Tie::File;
tie (@lines, Tie::File, $files[0]) or die "can't update $file: $!";
delete $lines[-1];
# Cut off header and footer of $files [1] to $files[-2]
for ($a = 1, $a < $#files-1, $a++){
    print 'Opening' . $file . "\n";
    use Tie::FILE;
    tie (@lines, TIE::File, $files[$a]) or die "can't update $file: $!"; ####this is line 28
    delete $lines[0];
    delete $lines[-1];
    open (FILE, "<", $files[$a]) or die $!;
    while (my $line =<FILE>) {
        $buff .= $line;
    }
    close FILE;
}
print 'Opening' . $files[-1] . "\n";
use Tie::FILE;
tie (@lines, TIE::File, $files[-1]) or die "can't update $file: $!";
delete $lines[0];
open (lastfile, "<", $files[-1]) or die "can't open $files[-1]: $!";
while (my $line =<lastfile>) {
    $buff .= $line;
}
close lastfile;
# Write the buffer to a new file.
my $allfilename = $cwd.'/Trace.txt';
print 'Writing all files into new file: ' . $allfilename . "\n";
open $outputfile, ">".$allfilename or die $!;
# Write the buffer into the output file.
print $outputfile $buff;
close $outputfile;
Perl module names are case sensitive. The module is called Tie::File, not Tie::FILE or TIE::File.
Your program is frankly a bit of a mess. You seem to be trying things in the hope that they work but without any real reasoning.
I have refactored your code to do what I think you want below. Here are the main changes I have made
You must always add use strict and use warnings to every Perl program you write, and declare all your variables with my as close as possible to their first point of use. Those simple measures alone will save you from a lot of simple errors that you will otherwise overlook
You don't need Tie::Array or Cwd. They are irrelevant to this program
Your tie statement needs a string as the second parameter, so you need to use 'Tie::File' instead of Tie::File
Your output file Trace.txt will be found by the <*.txt> glob, so unless you take measures to specifically exclude it your program will trim the first and last lines of that file and copy its contents onto itself. In my program I have simply checked in the for loop whether the current file name is Trace.txt and skipped it if so
There is no point in accumulating the data in a buffer $buff. You may as well just write the data to the file as you encounter it
The lines in the tied array @lines have no trailing newline, so you will presumably want to add one when you write to the file
As has been discussed in the comments, you are using Tie::FILE and TIE::File as well as the correct Tie::File. And you have written use Tie::File (and its variations) four times in total. Sure, it doesn't stop the program from working, but it is a major indication of foggy thinking, and that you are just throwing statements around in the hope that they make your program work
Using delete on anything other than the last element of an array just sets that element to undef: it doesn't delete it, and all that happens in the tied file is that the text is removed, leaving just a newline. You need to use splice instead; there is a short demonstration of the difference after the rewritten program below
Separating your files into the first, the last, and the rest is unnecessary and makes your code illegible. In my program below I have used a single loop that removes the first line of the file unless it's the first file, and removes the last line of the file unless it's the last file. It's far easier to read that way
Lastly, I'm not at all sure that you want to remove the first and last lines from the existing files, or if you just want all the data copied to your output file except those lines. I have written my program according to your specification, but bear in mind that the files will get shorter by two lines every time you run it, and that probably isn't the effect you want. If you have a different requirement and can't see how to modify the code to achieve it then please ask another question.
I hope this helps you.
use strict;
use warnings;
use Tie::File;
my @files = grep -f, glob '*.txt';
my $all_filename = 'Trace.txt';
open my $out_fh, '>', $all_filename or die qq{Unable to open "$all_filename" for output: $!};
for my $i ( 0 .. $#files ) {
    my $file = $files[$i];
    next if $file eq $all_filename;
    print "Opening $file\n";
    tie my @lines, 'Tie::File', $file or die qq{Can't update "$file": $!};
    splice @lines, 0, 1 unless $i == 0;
    splice @lines, -1, 1 unless $i == $#files;
    print $out_fh "$_\n" for @lines;
}
close $out_fh;
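To illustrate the earlier point about delete versus splice, here is a small sketch on an ordinary (untied) array:
my @a = ('first', 'second', 'third');
delete $a[0];       # @a is now (undef, 'second', 'third') - the slot is merely undefined
splice @a, 0, 1;    # removes the element: @a is now ('second', 'third')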

Read specific part of a filehandle in Perl

Hi, I have a large file I would like to read. To save resources I want to read it slowly, one line at a time. However, I'm wondering if there is a way to read a specific line from a filehandle instead. For example, say I have a test.txt file containing a billion numbers starting with 1. Each number is on a separate line.
1
2
3
...
So what I currently do to get, say, line 10 is this:
open (FILE, "< test.txt") or die "$!";
@reads = <FILE>;
print $reads[9];
However, is there a way I can access a certain part of the FILE without reading everything into a big array, say if I just want line 10?
something like FILE->[9]
Thanks in advance for helping!
Two methods: either do line-by-line processing, skipping ahead to the desired line, or jump straight to it with Tie::File. For the first, you can use the Input Line Number variable, $., to help:
use strict;
use warnings;
use autodie;
my $line10 = sub {
    open my $fh, '<', 'text.txt';
    while (<$fh>) {
        return $_ if $. == 10;
    }
}->();
Alternatively, you could use Tie::File as you already noticed. However, while that interface is very convenient, and I'd recommend its use, it will also loop through the file behind the scenes.
use strict;
use warnings;
use autodie;
use Tie::File;
tie my #array, 'Tie::File', 'text.txt' or die "Can't open text.txt: $!";
print $array[9] // die "Line 10 does not exist";
For memory purposes large files should be read in using a while loop which will read the file line by line:
open my $fh, '<', 'somefile.txt';
while ( my $line = <$fh> ) {
    # read in the text line by line
}
Either way, to get at that line number you are going to have to read through the file. I would recommend using the while loop and a counter to print or save the line you are looking for, as sketched below.
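A minimal sketch of that counter approach, assuming the test.txt file of numbers from the question:
open my $fh, '<', 'test.txt' or die $!;
my $wanted  = 10;          # the line number we are after
my $counter = 0;
my $found;
while ( my $line = <$fh> ) {
    $counter++;
    if ( $counter == $wanted ) {
        $found = $line;
        last;              # stop reading as soon as the line has been seen
    }
}
close $fh;
print $found if defined $found;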

Replace a number incrementally in same line using Perl

I have some XML data like this
<!--Q1: some text--><!--Q1: some text--><!--Q1: some text-->
I want to replace this query number in order like so
<!--Q1: some text--><!--Q2: some text--><!--Q3: some text-->..
I wrote this Perl script
#!/usr/bin/perl -w
$b=1;
use strict;
open(FILE, "<text.xml") || die "File not found";
my @lines = <FILE>;
close(FILE);
my @newlines;
while (<>) {
    $_ =~ s/<!--Q[0-9]{1,2}/<!--Q$b/g;
    $b++;
    push(@newlines, $_);
}
open(FILE, ">text.xml") || die "File not found";
print FILE @newlines;
but it only makes one replacement in each line.
My text:
<!--Q2: text-->
<!--Q3: text--><!--Q8: text-->
<!--Q10: text-->
output
<!--Q1: text-->
<!--Q2: text--><!--Q2: text-->
<!--Q3: text-->
There are many problems with your program
You must always use strict and use warnings as the first lines of your program
You should use lexical file handles (scalar variables) instead of global names
You should use the three-parameter form of open, and include the built-in variable $! in the die string if open fails
You should never use $a or $b as variable names. They don't help to document the program at all, and they are used internally by Perl so you can't rely on their contents
You have read the entirety of the file into @lines, and then expect there to be more to read in your while loop. You have already reached end of file, so the loop is never entered
It is pointless to test for exactly one or two digits following <!--Q. If there is an occurrence of three or more digits then the regex will still match, but only the first two digits will be replaced
There is no reason to push the modified lines to an array and print them all later. Just print each one as you change it
Use this instead. Version 10.0 of Perl 5 is required for the \K construct in the regex. It has been around since 2007, so if you are behind with your updates then you should really get that fixed.
use strict;
use warnings;
use 5.010;
open my $in, '<', 'text.xml' or die $!;
open my $out, '>', 'newtext.xml' or die $!;
my $n = 0;
while (<$in>) {
    s/<!--Q\K\d+/++$n/ge;
    print $out $_;
}
output
<!--Q1: text-->
<!--Q2: text--><!--Q3: text-->
<!--Q4: text-->
Update
If you don't have version 10 of Perl 5 available (and you really should - it is six years old and a major update) then you can write the regular expression like this
s/(<!--Q)\d+/$1.++$n/ge;

Perl incorrectly adding newline characters?

This is my tab delimited input file
Name<tab>Street<tab>Address
This is how I want my output file to look like
Street<tab>Address<tab>Address
(yes, duplicate the next two columns). My output file looks like this instead:
Street<tab>Address
<tab>Address
What is going on with Perl? This is my code.
open (IN, $ARGV[0]);
open (OUT, ">output.txt");
while ($line = <IN>){
    chomp $line;
    @line = split /\t/, $line;
    $line[2] =~ s/\n//g;
    print OUT $line[1]."\t".$line[2]."\t".$line[2]."\n";
}
close( OUT);
First of all, you should always
use strict and use warnings for even the most trivial programs. You will also need to declare each of your variables using my as close as possible to their first use
use lexical file handles and the three-parameter form of open
check the success of every open call, and die with a string that includes $! to show the reason for the failure
Note also that there is no need to explicitly open files named on the command line that appear in @ARGV: you can just read from them using <>.
As others have said, it looks like you are reading a file of DOS or Windows origin on a Linux system. Instead of using chomp, you can remove all trailing whitespace characters from each line using s/\s+\z//. Since CR and LF both count as "whitespace", this will remove all line terminators from each record. Beware, however, that, if trailing space is significant or if the last field may be blank, then this will also remove spaces and tabs. In that case, s/[\r\n]+\z// is more appropriate.
This version of your program works fine.
use strict;
use warnings;
@ARGV = 'addr.txt';
open my $out, '>', 'output.txt' or die $!;
while (<>) {
    s/\s+\z//;
    my @fields = split /\t/;
    print $out join("\t", @fields[1, 2, 2]), "\n";
}
close $out or die $!;
If you know beforehand the origin of your data file, and know it to be a DOS-like file that terminates records with CR LF, you can use the PerlIO crlf layer when you open the file. Like this
open my $in, '<:crlf', $ARGV[0] or die $!;
then all records will appear to end in just "\n" when they are read on a Linux system.
A general solution to this problem is to install PerlIO::eol. Then you can write
open my $in, '<:raw:eol(LF)', $ARGV[0] or die $!;
and the line ending will always be "\n" regardless of the origin of the file, and regardless of the platform where Perl is running.
Did you try to eliminate not only the "\n" but also the "\r"?
$file[2] =~ s/\r\n//g;
$file[3] =~ s/\r\n//g; # Is it the "good" one?
It could work. DOS line endings are "\r\n", so there is a "\r" to deal with as well (not only the "\n").
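In the context of the original loop, that extra clean-up could look like this (a sketch; chomp removes the trailing "\n" but leaves the "\r" of a DOS line ending in place):
chomp $line;
$line =~ s/\r\z//;    # strip a trailing carriage return, if there is one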
Another way to avoid end of line problems is to only capture the characters you're interested in:
open (IN, $ARGV[0]);
open (OUT, ">output.txt");
while (<IN>) {
    print OUT "$1\t$2\t$2\n" if /^\w+\t(\w+)\t(\w+)/;
}
close( OUT);