Find and replace a string in a text file in Perl

I really have no idea how to do this. I tried a lot of methods and I don't know why none of them work. Here is a sample of what I tried:
{
open my $fh1, '<', 'hex1.txt';
open my $fh2, '<', 'hex2.txt';
until ( eof $fh1 or eof $fh2 ) {
my @l1 = map hex, split //, <$fh1>;
my @l2 = map hex, split //, <$fh2>;
my $n = @l2 > @l1 ? @l2 : @l1;
my @sum = map {
no warnings 'uninitialized';
$l1[$_] + $l2[$_];
} 0 .. $n - 1;
@sum = map { sprintf '%X', $_ } @sum;
open my $out, '>', 'hexsum.txt';
print {$out} @sum, "\n";
}
close $fh1;
close $fh2;
}
{
open my $IN, "<", 'hexsum.txt';
open my $OUT, ">", 'sym.txt';
while ( my $linie = <$IN> ) {
$linie =~ s/40/20/g;
print $OUT $linie;
}
close $IN;
close $OUT;
}
{
my $input = do {
open my $in, '<', 'hexsumspace.txt';
local $/;
<$in>;
};
open my $out, '>', 'sym.txt';
print $out pack 'H*', $input;
}
How can I replace every occurrence of the value 40 with the value 20?

use strict;
use warnings;
open my $OUT, ">", 'output.txt';
open my $IN, "<", 'input.txt';
while (my $line = <$IN>) {
$line =~ s/40/20/g;
print $OUT $line;
}
close $IN;
close $OUT;

Here's a Perl one-liner from the command line:
perl -pe 's/40/20/g' input.txt > output.txt
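One caveat, given that the question's data is a hex dump: a plain s/40/20/g also matches a "40" that straddles two bytes (for example the bytes 34 02 written as "3402"). Below is a minimal sketch of a pair-aligned replacement. The helper name replace_hex_byte is made up for illustration, and it assumes the string contains only hex digits with no spaces or newlines.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): replace one byte value
# with another in a string of hex digits, looking only at aligned pairs.
# Assumes $hex contains hex digits only (no spaces or newlines).
sub replace_hex_byte {
    my ($hex, $from, $to) = @_;
    # /(..?)/g in list context cuts the string into two-character chunks
    # (the last chunk is a single character if the length is odd).
    return join '', map { lc($_) eq lc($from) ? $to : $_ } $hex =~ /(..?)/g;
}

print replace_hex_byte('404140', '40', '20'), "\n";   # bytes 40 41 40 -> 204120
print replace_hex_byte('3402',   '40', '20'), "\n";   # bytes 34 02 stay 3402
```

If the file really is plain text where any "40" should change, the simpler s/40/20/g above is all you need.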

Related

How can I remove the patterns that are matched from the original file

How can I remove the matched patterns that I print to the output file from the original files? I used the script below to print all matches to FILE6:
open FILE4,'<RM' or die $!;
open FILE5,'<OR' or die $!;
open FILE6, '>Compare3.txt' or die $!;
my @array3 = <FILE4>;
my @array4 = <FILE5>;
foreach $x (@array3) {
if($x =~ /(\S+) (\d+) (\S+)/) {
$temp_allreg = $3;
foreach $y (@array4) {
if($y =~ /\b$temp_allreg\b/i) {
print FILE6 "$x$y\n";
}
}
}
}
The code above prints the matched data from the two files to the output file, FILE6. How can I modify it to also remove the matched lines printed to FILE6 from the original files, FILE4 and FILE5, so that they keep only the data that did not match?
use strict;
use warnings;
my $rm_qfn = 'RM';
my $or_qfn = 'OR';
my $out_qfn = 'Compare3.txt';
open(my $rm_fh, '<', $rm_qfn)
or die("Can't open \"$rm_qfn\": $!\n");
open(my $or_fh, '<', $or_qfn)
or die("Can't open \"$or_qfn\": $!\n");
open(my $out_fh, '>', $out_qfn)
or die("Can't create \"$out_qfn\": $!\n");
open(my $out_rm_fh, '>', "$rm_qfn.tmp")
or die("Can't create \"$rm_qfn.tmp\": $!\n");
open(my $out_or_fh, '>', "$or_qfn.tmp")
or die("Can't create \"$or_qfn.tmp\": $!\n");
chomp( my @ors = <$or_fh> );
my @matched_ors;
while (my $rm = <$rm_fh>) {
chomp($rm);
my $matched_rm = 0;
if (my ($all_reg) = $rm =~ /\S+ \d+ (\S+)/) {
for my $or_idx (0..$#ors) {
my $or = $ors[$or_idx];
if ($or =~ /\b\Q$all_reg\E\b/i) {
++$matched_rm;
++$matched_ors[$or_idx];
print($out_fh "$rm$or\n");
}
}
}
if (!$matched_rm) {
print($out_rm_fh "$rm\n");
}
}
close($rm_fh);
close($or_fh);
for my $or_idx (0..$#ors) {
if (!$matched_ors[$or_idx]) {
my $or = $ors[$or_idx];
print($out_or_fh "$or\n");
}
}
close($out_rm_fh);
close($out_or_fh);
rename("$rm_qfn.tmp", $rm_qfn)
or die("Can't rename \"$rm_qfn.tmp\" to \"$rm_qfn\": $!\n");
rename("$or_qfn.tmp", $or_qfn)
or die("Can't rename \"$or_qfn.tmp\" to \"$or_qfn\": $!\n");
Do it in the same loop?
use File::Slurp qw(read_file write_file);
my $data = read_file $filename, {binmode => ':utf8'};
foreach $x (@array3) {
if($x =~ /(\S+) (\d+) (\S+)/) {
$temp_allreg = $3;
foreach $y (@array4) {
if($y =~ /\b$temp_allreg\b/i){
#print to output file
print FILE6 "$x$y\n";
#remove from input file
$data =~ s/$x$y//g;
write_file $filename, {binmode => ':utf8'}, $data;
}
}
}
}
Since you can't use the File::Slurp module, you can consider writing those subroutines yourself.
sub read_file {
my ($filename) = @_;
open my $in, '<:encoding(UTF-8)', $filename or die "Could not open '$filename' for reading $!";
local $/ = undef;
my $all = <$in>;
close $in;
return $all;
}
sub write_file {
my ($filename, $content) = @_;
open my $out, '>:encoding(UTF-8)', $filename or die "Could not open '$filename' for writing $!";
print $out $content;
close $out;
return;
}
See: How to replace a string in a file [perlmaven]

Perl - Comparison of Files using specific substrings

I've written this script to compare the lines of two files and output the common and non-common lines into two different files. The script is:
use strict;
use warnings;
use autodie;
my $f1 = shift || "CSP8216.TXT";
my $f2 = shift || "CSP8217.TXT";
open my $fh1, '>', 'file1';
open FH2, '>', 'file2';
my %results;
open my $file1, '<', $f1;
while (my $line = <$file1>) {
$results{$line} = 1
}
open my $file2, '<', $f2;
while (my $line = <$file2>) {
$results{$line}++
}
foreach my $line (sort { $results{$b} <=> $results{$a} } keys %results)
{
if ($results{$line} >= 1)
{
print {$fh1} "$line";
}
else
{
print FH2 "$line";
}
}
My problem comes when I try to modify this script to run the comparison on specific substrings of each line, i.e.:
If a specific substring of a line in file A matches a specific substring of a line in file B, then output the /entire/ line of file B to fh1; otherwise output it to fh2.
I tried this, but it doesn't work. I'm still really new to Perl, so any help will be appreciated:
use strict;
use warnings;
use autodie;
my $f1 = shift || "CSP8216.TXT";
my $f2 = shift || "CSP8216.TXT";
open my $fh1, '>', 'file1';
open FH2, '>', 'file2';
my %results;
open my $file1, '<', $f1;
while (my $line = <$file1>)
{
my $sbs1 = substr($line, 0, 10);
$results{$sbs1} = 1
}
open my $file2, '<', $f2;
while (my $line = <$file2>)
{
my $sbs2 = substr($line, 0, 10);
$results{$sbs2}++
}
foreach my $line (sort { $results{$b} <=> $results{$a} } keys %results)
{
if ($results{$line} >= 1)
{
print {$fh1} "$line";
}
else
{
print FH2 "$line";
}
}
This does not work, and I have a feeling it's a problem in the logic: it outputs just the substrings on a single line.
As per my comment, we need to keep the lines from file A and file B separate if a single line may appear twice in one file.
One option is to solve the basic problem like this:
open my $fh1, '<', $filename1 or die "Can't open $filename1: $!";
while (my $line = <$fh1>) {
$combined{$line} = $file1{$line} = 1;
}
open my $fh2, '<', $filename2 or die "Can't open $filename2: $!";
while (my $line = <$fh2>) {
$combined{$line} = $file2{$line} = 1;
}
open my $out1, '>', $outfilename1 or die "...";
open my $out2, '>', $outfilename2 or die "...";
for my $line (keys %combined) {
if ($file1{$line} && $file2{$line}) {
print $out1 $line;
} else {
print $out2 $line;
}
}
To solve the substring issue I would keep the substrings from each file as keys in the hashes. But instead of just storing the true value I would store the full string as value in %file2:
open my $fh1, '<', $filename1 or die "Can't open $filename1: $!";
while (my $line = <$fh1>) {
my $substr = substr($line, 0, 10);
$combined{$line} = $file1{$substr} = 1;
}
open my $fh2, '<', $filename2 or die "Can't open $filename2: $!";
while (my $line = <$fh2>) {
my $substr = substr($line, 20, 30);
$combined{$line} = 1;
$file2{$substr} = $line;
}
open my $out1, '>', $outfilename1 or die "...";
open my $out2, '>', $outfilename2 or die "...";
for my $line (keys %combined) {
my $substr1 = substr($line, 0, 10);
my $substr2 = substr($line, 20, 30);
if ($file1{$substr1} && $file2{$substr2}) {
print $out1 $file2{$substr2};
} else {
print $out2 $line;
}
}
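The substring-keyed lookup described above can also be sketched on in-memory lines, which makes it easy to try without any files. This is only an illustration: partition_by_key is a hypothetical name, and it assumes the key sits at the same offset and length in both files (0 and 10, as in the question's code) rather than at different positions per file.

```perl
use strict;
use warnings;

# Hypothetical helper: split the lines of file B into those whose key
# substring also occurs as a key in file A, and those whose key does not.
# $off and $len give the key position; 0 and 10 match the question's code.
sub partition_by_key {
    my ($lines_a, $lines_b, $off, $len) = @_;
    my %seen_in_a;
    for my $line (@$lines_a) {
        next if length($line) < $off;    # too short to contain a key
        $seen_in_a{ substr($line, $off, $len) } = 1;
    }
    my (@matched, @unmatched);
    for my $line (@$lines_b) {
        my $key = length($line) >= $off ? substr($line, $off, $len) : undef;
        if (defined $key && $seen_in_a{$key}) { push @matched,   $line }
        else                                  { push @unmatched, $line }
    }
    return (\@matched, \@unmatched);
}

my @file_a = ("ABCDEFGHIJ first\n", "0123456789 second\n");
my @file_b = ("ABCDEFGHIJ other text\n", "ZZZZZZZZZZ no match\n");
my ($common, $rest) = partition_by_key(\@file_a, \@file_b, 0, 10);
print "common: $_" for @$common;
print "rest:   $_" for @$rest;
```

Because the full lines of file B are kept in arrays rather than used as hash keys, duplicate lines in file B are preserved and the output order follows the input.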
This works for me
#!/usr/bin/perl
use warnings;
use autodie;
my %results;
my $f1 = shift || "CSP8216.TXT";
my $f2 = shift || "CSP8217.TXT";
open my $fh1, '>', 'file1';
open my $fh2, '>', 'file2';
open my $file1, '<', $f1;
while (my $line = <$file1>) {
my $sbs1 = substr($line, 0, 10);
$results{$sbs1} = 1
}
open my $file2, '<', $f2;
while (my $line = <$file2>) {
my $sbs2 = substr($line, 0, 10);
if (!$results{$sbs2}) {
$results{$sbs2} = 1;
}
$results{$sbs2}++
}
foreach my $line (sort { $results{$b} <=> $results{$a} } keys %results) {
if ($results{$line} > 1) {
print {$fh1} "$line";
}
else {
print {$fh2} "$line";
}
}

Sum of hex values in perl

I am trying to read two ASCII files, write them out as two hex files (sym.txt, sym2.txt), then sum the two hex files and write the reversed result to a final file (symout.txt). I really don't see what I'm doing wrong. Thank you in advance for the help :D
use strict;
use warnings 'all';
open my $in, "<", "f1.txt";
my $input = do { local $/; <$in> };
open my $out, ">", "sym.txt";
print $out unpack 'H*', $input;
open my $in1, "<", "f2.txt";
my $input1 = do { local $/; <$in1> };
open my $out1, ">", "sym2.txt";
print $out1 unpack 'H*', $input1;
open(my $fh1, '<', 'sym.txt') or die $!;
open(my $fh2, '<', 'sym2.txt') or die $!;
open my $fh_out, '>', 'symout.txt' or die $!;
until ( eof $fh1 or eof $fh2 ) {
my @l1 = map hex, split '', <$fh1>;
my @l2 = map hex, split '', <$fh2>;
my $n = @l2 > @l1 ? @l2 : @l1;
my @sum = map {
no warnings 'uninitialized';
$l1[$_] + $l2[$_];
} 0 .. $n-1;
@sum = map { sprintf '%X', $_ } @sum;
print { $fh_out } reverse(@sum), "\n";
}
The main problem is that you haven't closed $out or $out1, so the data printed to those handles is still in memory waiting to be flushed.
It's best to use lexical file handles (as you are doing) and add blocks so that the handles are closed implicitly when they go out of scope.
Here's an example of what I mean. Note that I've also added use autodie to avoid having to check the status of every open call (which you should have done but didn't!).
use strict;
use warnings 'all';
use v5.14;
use autodie;
{
my $input = do {
open my $in, '<', 'f1.txt';
local $/;
<$in>
};
open my $out, '>', 'sym.txt';
print $out unpack 'H*', $input;
}
{
my $input = do {
open my $in, '<', 'f2.txt';
local $/;
<$in>
};
open my $out, '>', 'sym2.txt';
print $out unpack 'H*', $input;
}
open my $fh1, '<', 'sym.txt';
open my $fh2, '<', 'sym2.txt';
until ( eof $fh1 or eof $fh2 ) {
my @l1 = map hex, split //, <$fh1>;
my @l2 = map hex, split //, <$fh2>;
my $n = @l2 > @l1 ? @l2 : @l1;
my @sum = map {
no warnings 'uninitialized';
$l1[$_] + $l2[$_];
} 0 .. $n-1;
@sum = map { sprintf '%X', $_ } @sum;
open my $out, '>', 'symout.txt';
print { $out } reverse(@sum), "\n";
}
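To see the scoping point in isolation, here is a small sketch that writes through a lexical handle inside a bare block and checks the file size once the block has ended. The helper name size_after_scoped_write and the temporary directory are only for illustration; they are not part of the code above.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical demo: write through a lexical handle inside a bare block and
# return the file size seen after the block has ended.
sub size_after_scoped_write {
    my ($path, $text) = @_;
    {
        open my $out, '>', $path or die "Can't create '$path': $!";
        print {$out} $text;
    }   # $out goes out of scope here, so Perl flushes and closes it
    return -s $path;
}

my $dir = tempdir(CLEANUP => 1);
# unpack 'H*' turns "abc" into the six hex digits 616263
print size_after_scoped_write("$dir/sym.txt", unpack('H*', "abc")), "\n";
```

Without the inner block (or an explicit close), the handle would stay open and a short string like this would most likely still be sitting in Perl's output buffer when the file is read back.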

How can I print lines from a file to separate files

I have a file which has lines like this:
1 107275 447049 scaffold1443 465 341154 -
There are several lines that start with 1; then a blank line separates them from the lines that start with 2, and so on.
I want to write these lines to different files based on their number.
I wrote this script, but it prints only the first line to every file.
#!/usr/bin/perl
#script for choosing chromosome
use strict;
my $filename= $ARGV[0];
open(FILE, $filename);
while (my $line = <FILE>) {
my @data = split('\t', $line);
my $length = @data;
#print $length;
my $num = $data[0];
if ($length == 6) {
open(my $fh, '>', $num);
print $fh $line;
}
$num = $num + 1;
}
Please, I need your help!
Use >> to open the file for appending, since > always truncates the file to zero bytes:
use strict;
my $filename = $ARGV[0];
open(my $FILE, "<", $filename) or die $!;
while (my $line = <$FILE>) {
my @data = split('\t', $line);
my $length = @data;
#print $length;
my $num = $data[0];
if ($length == 6) {
open(my $fh, '>>', $num);
print $fh $line;
}
$num = $num + 1;
}
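The difference between the two modes is easy to check on a scratch file. The sketch below reopens the file for every line, as the question's loop does, once with > and once with >>. The helper name write_lines_with_mode and the temporary directory are assumptions made for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical demo: write each line through a freshly opened handle using
# the given mode ('>' or '>>'), then return what the file contains.
sub write_lines_with_mode {
    my ($mode, $path, @lines) = @_;
    for my $line (@lines) {
        open my $fh, $mode, $path or die "Can't open '$path': $!";
        print {$fh} $line;
        close $fh or die "Can't close '$path': $!";
    }
    open my $in, '<', $path or die "Can't read '$path': $!";
    local $/;            # slurp the whole file in one read
    return scalar <$in>;
}

my $dir   = tempdir(CLEANUP => 1);
my @lines = ("1\ta\n", "1\tb\n");
print write_lines_with_mode('>',  "$dir/truncate", @lines);  # only the last line survives
print write_lines_with_mode('>>', "$dir/append",   @lines);  # both lines are kept
```

Note that >> also keeps whatever was in the file from a previous run, so you may want to delete old output files before starting.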
If I understand your question correctly, then paragraph mode might be useful. This breaks a record on two or more new-lines, instead of just one:
@ARGV or die "Supply a filename\n";
my $filename= $ARGV[0];
local $/ = ""; # Set paragraph mode
open(my $file, $filename) or die "Unable to open '$filename' for read: $!";
while (my $lines = <$file>) {
my $num = (split("\t", $lines))[0];
open(my $fh, '>', $num) or die "Unable to open '$num' for write: $!";
print $fh $lines;
close $fh;
}
close $file;
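If you want to see what paragraph mode hands you without touching the disk, here is a small sketch that reads from an in-memory filehandle. The helper name first_fields_by_paragraph and the sample text are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: read $text in paragraph mode and return the first
# tab-separated field of each blank-line-separated record.
sub first_fields_by_paragraph {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Can't open in-memory handle: $!";
    local $/ = "";    # paragraph mode: records end at one or more blank lines
    my @firsts;
    while (my $record = <$fh>) {
        # $record holds every line of the paragraph, not just the first one
        push @firsts, (split /\t/, $record)[0];
    }
    return @firsts;
}

# Two paragraphs: two lines starting with 1, then one line starting with 2
print join(',', first_fields_by_paragraph("1\ta\n1\tb\n\n2\tc\n")), "\n";   # 1,2
```

Each read returns a whole group of lines at once, which is why a single open with > per record is enough in the answer above.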

Extracting specific multiple line of records that is pipe delimited in perl

I have a file that looks like
NAME|JOHN|TOKYO|JPN
AGE|32|M
INFO|SINGLE|PROFESSIONAL|IT
NAME|MARK|MANILA|PH
AGE|37|M
INFO|MARRIED|PROFESSIONAL|BPO
NAME|SAMANTHA|SYDNEY|AUS
AGE|37|F
INFO|MARRIED|PROFESSIONAL|OFFSHORE
NAME|LUKE|TOKYO|JPN
AGE|27|M
INFO|SINGLE|PROFESSIONAL|IT
I want to separate the records by country. I have stored each line in the array variable @fields
my @fields = split(/\|/, $_ );
using $fields[3] as the basis for sorting. I want to split the records into 2 output text files
OUTPUT TEXT FILE 1:
NAME|JOHN|TOKYO|JPN
AGE|32|M
INFO|SINGLE|PROFESSIONAL|IT
NAME|LUKE|TOKYO|JPN
AGE|27|M
INFO|SINGLE|PROFESSIONAL|IT
OUTPUT TEXT FILE 2
NAME|MARK|MANILA|PH
AGE|37|M
INFO|MARRIED|PROFESSIONAL|BPO
NAME|SAMANTHA|SYDNEY|AUS
AGE|37|F
INFO|MARRIED|PROFESSIONAL|OFFSHORE
All records from JPN go to output text file 1, and records from non-JPN countries go to output text file 2.
Here's the code I'm trying to get working:
use strict;
use warnings;
use Data::Dumper;
use Carp qw(croak);
my @fields;
my $tmp_var;
my $count;
;
my ($line, $i);
my $filename = 'data.txt';
open(my $input_fh, '<', $filename ) or croak "Can't open $filename: $!";
open(OUTPUTA, ">", 'JPN.txt') or die "wsl_reformat.pl: could not open $ARGV[0]";
open(OUTPUTB, ">", 'Non-JPN.txt') or die "wsl_reformat.pl: could not open $ARGV[0]";
my $fh;
while (<$input_fh>) {
chomp;
my @fields = split /\|/;
if ($fields[0] eq 'NAME') {
for ($i=1; $i < @fields; $i++) {
if ($fields[3] eq 'JPN') {
$fh = $_;
print OUTPUTA $fh;
}
else {
$fh = $_;
print OUTPUTB $fh;
}
}
}
}
close(OUTPUTA);
close(OUTPUTB)
Still no luck with it :(
Here is what I think ikegami was suggesting. I've never tried this before, although it gave the correct results.
#!/usr/bin/perl
use strict;
use warnings;
open my $jpn_fh, ">", 'o33.txt' or die $!;
open my $other_fh, ">", 'o44.txt' or die $!;
my $fh;
while (<DATA>) {
if (/^NAME/) {
if (/JPN$/) {
$fh = $jpn_fh;
}
else {
$fh = $other_fh;
}
}
print $fh $_;
}
close $jpn_fh or die $!;
close $other_fh or die $!;
__DATA__
NAME|JOHN|TOKYO|JPN
AGE|32|M
INFO|SINGLE|PROFESSIONAL|IT
NAME|MARK|MANILA|PH
AGE|37|M
INFO|MARRIED|PROFESSIONAL|BPO
NAME|SAMANTHA|SYDNEY|AUS
AGE|37|F
INFO|MARRIED|PROFESSIONAL|OFFSHORE
NAME|LUKE|TOKYO|JPN
AGE|27|M
INFO|SINGLE|PROFESSIONAL|IT
You didn't say what you needed help with, so I'm assuming it's coming up with an algorithm. Here's a good one:
Open the file to read.
Open the file for the JPN entries.
Open the file for the non-JPN entries.
While not eof,
    Read a line.
    Parse the line.
    If it's the first line of a record,
        If the person's country is JPN,
            Set current file handle to the file handle for JPN entries.
        Else,
            Set current file handle to the file handle for non-JPN entries.
    Print the line to the current file handle.
my $jpn_qfn = '...';
my $other_qfn = '...';
open(my $jpn_fh, '>', $jpn_qfn)
or die("Can't create $jpn_qfn: $!\n");
open(my $other_fh, '>', $other_qfn)
or die("Can't create $other_qfn: $!\n");
my $fh;
while (<>) {
chomp;
my @fields = split /\|/;
if ($fields[0] eq 'NAME') {
$fh = $fields[3] eq 'JPN' ? $jpn_fh : $other_fh;
}
say $fh $_;
}
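The same "remember the current destination" idea can be tried on in-memory lines, with array buckets standing in for the two output handles. This is only a sketch: route_records and the bucket names jpn and other are hypothetical, and lines that appear before the first NAME line are simply skipped.

```perl
use strict;
use warnings;

# Hypothetical helper: route each line to the 'jpn' or 'other' bucket.
# A NAME line chooses the bucket; the lines that follow inherit it.
sub route_records {
    my (@lines) = @_;
    my %bucket = (jpn => [], other => []);
    my $current;    # bucket chosen by the most recent NAME line
    for my $line (@lines) {
        (my $text = $line) =~ s/\s+\z//;     # drop newline and trailing blanks
        my @fields = split /\|/, $text;
        if (@fields && $fields[0] eq 'NAME') {
            my $country = $fields[3] // '';
            $current = $country eq 'JPN' ? 'jpn' : 'other';
        }
        next unless defined $current;        # skip lines before the first NAME
        push @{ $bucket{$current} }, $text;
    }
    return \%bucket;
}

my $out = route_records(
    "NAME|JOHN|TOKYO|JPN\n", "AGE|32|M\n",
    "NAME|MARK|MANILA|PH\n", "AGE|37|M\n",
);
print "$_\n" for @{ $out->{jpn} };    # the two lines of JOHN's record
```

Swapping the array buckets for file handles gives back the file-based version above.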
#!/usr/bin/env perl
use 5.012;
use autodie;
use strict;
use warnings;
# store per country output filehandles
my %output;
# since this is just an example, read from __DATA__ section
while (my $line = <DATA>) {
# split the fields
my @cells = split /[|]/, $line;
# if first field is NAME, this is a new record
if ($cells[0] eq 'NAME') {
# get the country code, strip trailing whitespace
(my $country = $cells[3]) =~ s/\s+\z//;
# if we haven't created an output file for this
# country, yet, do so
unless (defined $output{$country}) {
open my $fh, '>', "$country.out";
$output{$country} = $fh;
}
my $out = $output{$country};
# output this and the next two lines to
# country specific output file
print $out $line, scalar <DATA>, scalar <DATA>;
}
}
close $_ for values %output;
__DATA__
NAME|JOHN|TOKYO|JPN
AGE|32|M
INFO|SINGLE|PROFESSIONAL|IT
NAME|MARK|MANILA|PH
AGE|37|M
INFO|MARRIED|PROFESSIONAL|BPO
NAME|SAMANTHA|SYDNEY|AUS
AGE|37|F
INFO|MARRIED|PROFESSIONAL|OFFSHORE
NAME|LUKE|TOKYO|JPN
AGE|27|M
INFO|SINGLE|PROFESSIONAL|IT
Thanks heaps for your help.
I was able to solve this problem in Perl,
many thanks.
#!/usr/local/bin/perl
use strict;
use warnings;
use Data::Dumper;
use Carp qw(croak);
my @fields;
my $tmp_var;
my ($rec_type, $country);
my $filename = 'data.txt';
open (my $input_fh, '<', $filename ) or croak "Can't open $filename: $!";
open my $OUTPUTA, ">", 'o33.txt' or die $!;
open my $OUTPUTB, ">", 'o44.txt' or die $!;
my $Combline;
while (<$input_fh>) {
$_ = _trim($_);
@fields = split (/\|/, $_);
$rec_type = $fields[0];
$country = $fields[3];
if ($rec_type eq 'NAME') {
if ($country eq 'JPN') {
*Combline = $OUTPUTA;
}
else {
*Combline = $OUTPUTB;
}
}
print Combline;
}
close $OUTPUTA or die $!;
close $OUTPUTB or die $!;
sub _trim {
my $word = shift;
if ( $word ) {
$word =~ s/\s*\|/\|/g; #remove trailing spaces
$word =~ s/"//g; #remove double quotes
}
return $word;
}