Read and write file bit by bit - perl

I have a file, for example a .jpg, that I want to read bit by bit. I do this:
open(FH, "<", "red.jpg") or die "Error: $!\n";
my $str;
while(<FH>) {
$str .= unpack('B*', $_);
}
close FH;
Well it gives me $str with 0101001 of the file. After that I do this:
open(AB, ">", "new.jpg") or die "Error: $!\n";
binmode(AB);
print AB $str;
close AB;
but it doesn't work.
How can I do it? And how can I make it work regardless of byte order (cross-platform)?

Problems:
You didn't use binmode when reading, only when writing.
It makes no sense to read a binary file line by line since they don't have lines.
You're needlessly using global variables for your file handles.
And the one that answers your question: You didn't reverse the unpack.
open(my $in, "<", "red.jpg")
or die("Can't open red.jpg: $!\n");
binmode($in);
my $file; { local $/; $file = <$in>; } # slurp the whole file at once
my $binary = unpack('B*', $file);
open(my $out, ">", "new.jpg")
or die("Can't create new.jpg: $!\n");
binmode($out);
print $out pack('B*', $binary);
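On the byte-order part of the question: the 'B*' template works one byte at a time, most significant bit first, so the bit string should not depend on the machine's endianness. A minimal in-memory sketch of the round trip, using three made-up sample bytes rather than a real .jpg:

```perl
use strict;
use warnings;

# Three arbitrary sample bytes (an assumption for illustration, not real JPEG data).
my $bytes = "\xFF\x00\xA5";

# Each byte becomes eight '0'/'1' characters, most significant bit first.
my $bits = unpack('B*', $bytes);

# pack with the same template reverses the conversion.
my $restored = pack('B*', $bits);

print "$bits\n";
print $restored eq $bytes ? "round trip ok\n" : "mismatch\n";
```

The bit string is eight times the size of the data, so for large files it may be better to process the input in chunks.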

Related

Splitting a string into multiple lines and write to file

With Ensembl API, there is a seq() method which prints a sequence in one line (1xN characters). The length of the sequence may be long (order of 10K). So, I want to split the string for every 100 characters. Currently, to write into a file, I use
my $outfilename = $row;
open(my $ofh, '>', $outfilename) or die "Could not open file '$outfilename' $!";
print $ofh $gene->seq();
Where $gene has been defined in the code.
How can I do that?
This is easiest using unpack
open my $ofh, '>', $outfilename or die qq{Could not open file "$outfilename" for output: $!};
print $ofh "$_\n" for unpack '(A100)*', $gene->seq;
my $outfilename = $row;
open(my $ofh, '>', $outfilename) or die "Could not open file '$outfilename' $!";
{
my $gene_seq = $gene->seq();
my @chunks = $gene_seq =~ /(.{1,100})/sg; # split $gene_seq into 100 chars chunks
print $ofh join("\n",@chunks),"\n";
}
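To compare the two approaches without the Ensembl API, here is a small sketch on a made-up ten-character string with a chunk width of 4. Both $seq and $width are stand-ins chosen for illustration.

```perl
use strict;
use warnings;

my $seq   = 'ACGTACGTAC';   # hypothetical stand-in for $gene->seq
my $width = 4;              # use 100 for the real sequence

# unpack: repeat a fixed-width text field until the string is exhausted.
my @by_unpack = unpack "(A$width)*", $seq;

# regex: capture runs of 1 to $width characters, including a short last chunk.
my @by_regex = $seq =~ /(.{1,$width})/sg;

print "$_\n" for @by_unpack;
```

Note that the A template strips trailing spaces and nulls from each chunk. That does not matter for a sequence of letters, but the lowercase a template keeps them if it ever does.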

Perl iterating through each line in a file and appending to the end of each line in another file - follow up

I have a follow-up question regarding an earlier post.
The post in question is:
Perl iterating through each line in a file and appending to the end of each line in another file
I used:
use warnings;
use strict;
open my $animals, '<', 'File1.txt' or die "Can't open animals: $!";
open my $payloads, '<', 'File2.txt' or die "Can't open payloads: $!";
my @payloads = <$payloads>; #each line of the file into an array
close $payloads or die "Can't close payloads: $!";
while (my $line = <$animals>) {
chomp $line;
print $line.$_ foreach (@payloads);
}
close $animals or die "Can't close animals: $!";
This works fine for files that look like this:
file 1: file 2:
line1 lineA
line2 lineB
line3 lineC
but not for files that look like this:
<01 line1
<02 line2
So what I want to do is the following:
file 1: file 2:
<01 line1 <AA lineAA
<02 line2 <AB lineAB
should become:
file 3:
<01_AA line1lineAA
<01_AB line1lineAB
<02_AA line2lineAA
<02_AB line2lineAB
I have tried to solve it by splitting the strings on the tab using while loops in while loops (see below), but I cannot get it to work.
my script:
#!C:/perl64/bin/perl.exe
use warnings;
use strict;
open my $file1, '<', 'file1.fasta' or die "Can't open file1: $!";
open my $file2, '<', 'file2.fasta' or die "Can't open file2: $!";
open(OUT, '>', 'file3.fasta') or die "Cannot write $!";
while (<$file2>)
{
chomp;
my ($F2_Id, @SF2_seq) = split (/\t/, $_);
while (<$file1>)
{
chomp;
my ($F1_Id, @F1_seq) = split (/\t/, $_);
foreach my $seq (@F1_seq)
{
print OUT $F1_Id,"_",$F2_Id,"\t",$seq.$_ foreach (@F2_seq),"\n";
}
close;
}
}
I started with perl just recently so I can imagine that there are a lot of faults in the script.
I'm sorry for the really long post, but I would appreciate any help.
You can store the id and seq of the first file in an array of arrays.
You also have to replace the < in the second file with _.
#!/usr/bin/perl
use warnings;
use strict;
open my $LEFT, '<', 'file1.fasta' or die "Can't open file1: $!";
open my $RIGHT, '<', 'file2.fasta' or die "Can't open file2: $!";
open my $OUT, '>', 'file3.fasta' or die "Cannot write: $!";
my @left;
while (<$LEFT>) {
chomp;
push @left, [ split /\t/ ];
}
while (<$RIGHT>) {
chomp;
my ($id, $seq) = split /\t/;
$id =~ s/</_/;
print {$OUT} "$_->[0]$id\t$_->[1]$seq\n" for @left;
}
close $OUT or die "Cannot close: $!";
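The same logic can be tried without any files. The sketch below uses two small in-memory lists in place of file1.fasta and file2.fasta, with the sample records taken from the question and assumed to be tab-separated.

```perl
use strict;
use warnings;

# In-memory stand-ins for the two input files (tab-separated id and sequence).
my @file1 = ("<01\tline1", "<02\tline2");
my @file2 = ("<AA\tlineAA", "<AB\tlineAB");

# Array of arrays: each element is [ id, seq ] from the first file.
my @left = map { [ split /\t/ ] } @file1;

my @out;
for my $record (@file2) {
    my ($id, $seq) = split /\t/, $record;
    $id =~ s/</_/;    # "<AA" becomes "_AA"
    push @out, "$_->[0]$id\t$_->[1]$seq" for @left;
}

print "$_\n" for @out;
```

Because the outer loop runs over the second file, the records come out grouped by the second file's id (<01_AA, <02_AA, <01_AB, <02_AB). To get the order shown in the question, store the second file in the array and loop over the first file instead.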

Printing a content of a file to the screen in perl

I have a Perl script which writes a few lines into a file. (I checked, and the file is written correctly.)
Right after that I want to print the content to the screen. The way I'm trying to do it is to read the file and print it:
open (FILE, '>', "tmpLogFile.txt") or die "could not open the log file\n";
$aaa = <FILE>;
close (FILE);
print $aaa;
but I get nothing on the screen, what do I do wrong?
To read you need to specify the open mode as <.
Also, $aaa = <FILE> has scalar context, and only reads a line.
Using print <FILE> you can have list context and read all lines:
open (FILE, '<', "tmpLogFile.txt") or die "could not open the log file\n";
print <FILE>;
close (FILE);
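The difference between scalar and list context can be seen without touching the disk by opening a filehandle on a string. The contents of $data are made up for illustration.

```perl
use strict;
use warnings;

my $data = "first\nsecond\nthird\n";   # made-up file contents

# Opening a reference to a scalar gives an in-memory filehandle.
open(my $fh, '<', \$data) or die "Can't open in-memory file: $!\n";

my $one  = <$fh>;   # scalar context: reads a single line
my @rest = <$fh>;   # list context: reads all remaining lines

close($fh);

print "scalar context got: $one";
print "list context got ", scalar(@rest), " more lines\n";
```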
Try this:
use strict;
use warnings;
my $filename = 'data.txt';
open(my $fh, '<:encoding(UTF-8)', $filename)
or die "Could not open file '$filename' $!";
while (my $row = <$fh>) {
chomp $row;
print "$row\n";
}
print "done\n"

Fix files "corrupted" by Perl

I have a bunch of files that were created using this code:
use LWP::Simple;
my $xl = get("http://www.somewhere.com/file.xls");
open(my $outf, '>', "C:/file.xls") || die $!;
print $outf $xl;
Only recently did I realize that I should have been using '>:raw' in the filehandle rather than just '>'. So now I have a bunch of files that have been modified in some way that prevents Excel from opening them.
My question is whether there is some processing I can do with Perl to these files to get back to the original Excel files. In other words, is it possible to figure out what edits would have been made to the file that I can undo with a new Perl script?
It converted LF to CRLF. You can simply change any instance of CRLF back to LF.
my $qfn_in = $qfn;
my $qfn_out = $qfn . ".new";
open(my $fh_in, '<:raw', $qfn_in ) or die $!;
open(my $fh_out, '>:raw', $qfn_out) or die $!;
while (<$fh_in>) {
s/\r\n\z/\n/;
print($fh_out $_);
}
Or
my $qfn_in = $qfn;
my $qfn_out = $qfn . ".new";
open(my $fh_in, '<:raw:crlf', $qfn_in ) or die $!;
open(my $fh_out, '>:raw', $qfn_out) or die $!;
print($fh_out $_) while <$fh_in>;
If you have dos2unix, you could also use that. (Though JRFerguson says that his version of it will corrupt files with character 1A in it.)
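A small in-memory sketch of why the reversal works, assuming the only damage was the LF to CRLF translation that text-mode output performs on Windows. The sample string is made up and deliberately contains a CRLF that was part of the original data.

```perl
use strict;
use warnings;

# Made-up "binary" data containing a bare LF and a genuine CRLF.
my $original = "a\nb\r\nc";

# Simulate the text-mode write: every LF becomes CRLF.
(my $damaged = $original) =~ s/\n/\r\n/g;

# Undo it: every CRLF goes back to LF.
(my $repaired = $damaged) =~ s/\r\n/\n/g;

print $repaired eq $original ? "repaired matches original\n" : "mismatch\n";
```

The genuine CRLF survives because it was written out as CR CR LF, and the repair turns that back into CR LF.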

using perl tie::file with utf encoded file

Can I use Tie::File with an output file of utf encoding? I can't get this to work right.
What I am trying to do is open this utf encoded file, remove the match string from the file and rename the file.
Code:
use strict;
use warnings;
use Tie::File;
use File::Copy;
my ($input_file) = qw (test.txt);
open my $infh, "<:encoding(UTF-16LE)", $input_file or die "cannot open '$input_file': $!";
for (<$infh>) {
tie my @lines, "Tie::File", $_;
shift @lines if $lines[0] =~ m/MyHeader/;
untie @lines;
my ($name) = /^(.*).csv/i;
move($_, $name . ".dat");
}
close $infh
or die "Cannot close '$input_file': $!";
Code: (updated)
my ($input_file) = qw (test.txt);
my $qfn_in = $input_file;
my $qfn_out = $qfn_in . ".dat";
open(my $fh_in, "<:raw:perlio:encoding(UTF-16le):crlf:utf8", $qfn_in)
or die("Can't open \"$qfn_in\": $!\n");
open(my $fh_out, ">:raw:perlio:encoding(UTF-16le):crlf:utf8", $qfn_out)
or die("Can't open \"$qfn_out\": $!\n");
while (<$fh_in>) {
next if $. == 1 && /MyHeader/;
print($fh_out $_)
or die("Can't write to \"$qfn_out\": $!");
}
close($fh_in);
close($fh_out) or die("Can't write to \"$qfn_out\": $!");
rename($qfn_out, $qfn_in)
or die("Can't rename: $!\n");
This is underdocumented in the Tie::File perldoc, but you want to pass the discipline => ':encoding(UTF-16LE)' option when you tie the file:
tie my @lines, 'Tie::File', $input_file, discipline => ':encoding(UTF-16LE)'
Note that the third argument is the name of the file to associate with the tied array. Tie::File will automatically open and manage the filehandle for you; there is no need to call open on the file yourself.
@lines now contains the contents of the file, so the next thing to do is check the first line:
if ($lines[0] =~ m/pattern/) {
my $line = shift @lines;
untie @lines; # rewrites, closes the file, w/o first line
my ($name) = $line =~ /^(.*).csv/i;
rename $input_file, "$name.dat";
}
But I concur with TLP that Tie::File is overkill for this job.
(My previous answer about opening a filehandle with the correct encoding and passing the glob as the third arg to Tie::File won't work, as (1) it didn't open the file in read/write mode and (2) even if it did, Tie::File can't or doesn't apply the encoding on both the reading from and writing to the file handle)
my $qfn_in = ...;
my $qfn_out = $qfn_in . ".tmp";
open(my $fh_in, "<:raw:perlio:encoding(UTF-16le):crlf:utf8", $qfn_in)
or die("Can't open \"$qfn_in\": $!\n");
open(my $fh_out, ">:raw:perlio:encoding(UTF-16le):crlf:utf8", $qfn_out)
or die("Can't open \"$qfn_out\": $!\n");
while (<$fh_in>) {
next if $. == 1 && /MyHeader/;
print($fh_out $_)
or die("Can't write to \"$qfn_out\": $!");
}
close($fh_in);
close($fh_out) or die("Can't write to \"$qfn_out\": $!");
rename($qfn_out, $qfn_in)
or die("Can't rename: $!\n");
(:perlio and :utf8 are workarounds for bugs that existed back then.)
The line:
tie my @lines, "Tie::File", $_;
tries to tie @lines to a file named after each line of test.txt. Since test.txt does not seem to be a list of filenames, I suspect that the tie fails.
What you are probably after is using Tie::File on test.txt. If you only want to check the first line of that file, you do not need a loop.
So you'd need something like:
use autodie; #handy to check for fatal errors
tie my @lines, "Tie::File", $input_file;
shift @lines if $lines[0] =~ /MyHeader/;
untie @lines;
if ($input_file =~ /(.+).csv/i) {
move($input_file, $1);
}
But there are simpler ways to check the first line of a file. This will check one file:
perl -we '$_=<>; print unless /MyHeader/; print <>;' test.txt > test.dat