I have 5 files containing the same words. I want to read each word in all the files and decide the winning word by detecting the following characters in a word (*, #, $, &), separated by tabs. Then I want to generate an output file. It can only have 2 winners. For example:
file1
we$
are*
...
file2
we$
are#
...
file3
we&
are*
...
file4
we$
are#
...
file5
we$
are&
...
output file:
we$
are*#
Here is how I started:
#!/usr/local/bin/perl -w

sub read_file_line {
    my $fh = shift;
    if ($fh and my $line = <$fh>) {
        chomp($line);
        return $line;
    }
    return;
}

open(my $f1, "words1.txt") or die "Can't";
open(my $f2, "words2.txt") or die "Can't";
open(my $f3, "words3.txt") or die "Can't";
open(my $f4, "words4.txt") or die "Can't";
open(my $f5, "words5.txt") or die "Can't";

my $r1 = read_file_line($f1);
my $r2 = read_file_line($f2);
my $r3 = read_file_line($f3);
my $r4 = read_file_line($f4);
my $r5 = read_file_line($f5);

while ($f5) {
    #What can I do here to decide and write the winning word in the output file?
    $r1 = read_file_line($f1);
    $r2 = read_file_line($f2);
    $r3 = read_file_line($f3);
    $r4 = read_file_line($f4);
    $r5 = read_file_line($f5);
}
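As a minimal sketch of one way to fill in that loop body (not the accepted approach below, just an illustration; the helper pick_winner is hypothetical, it assumes each line is a word followed by a single marker character, and it assumes an output filehandle $out has already been opened):

sub pick_winner {
    my @lines = @_;
    my %count;
    $count{$_}++ for grep { defined } @lines;

    # A word/marker combination seen in a strict majority of files wins outright.
    for my $line (sort keys %count) {
        return $line if $count{$line} > @lines / 2;
    }

    # Otherwise keep the word once and collect the markers of the top-scoring
    # variants, e.g. "are*" and "are#" with two votes each become "are#*".
    my ($best)   = sort { $count{$b} <=> $count{$a} } keys %count;
    my ($word)   = $best =~ /^(\w+)/;
    my $markers  = join '', map { /(\W)$/ } grep { $count{$_} == $count{$best} } sort keys %count;
    return "$word$markers";
}

# Inside the while loop, something like:
#   print {$out} pick_winner($r1, $r2, $r3, $r4, $r5), "\n";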
Test Data Generator
#!/usr/bin/env perl
use strict;
use warnings;
foreach my $i (1..5)
{
    my $file = "words$i.txt";
    open my $fh, '>', $file or die "Failed to open $file for writing ($!)";
    foreach my $w (qw(we are the people in charge and what we say goes))
    {
        my $suffix = substr('*#$&', rand(4), 1);
        print $fh "$w$suffix\n";
    }
}
Majority Voting Code
#!/usr/bin/env perl
use strict;
use warnings;

my @files = ( "words1.txt", "words2.txt", "words3.txt",
              "words4.txt", "words5.txt"
            );
my @fh;

{
    my $n = 0;
    foreach my $file (@files)
    {
        open my $f, '<', $file or die "Can't open $file for reading ($!)";
        $fh[$n++] = $f;
    }
}

while (my $r = process_line(@fh))
{
    print "$r\n";
}

sub process_line
{
    my(@fhlist) = @_;
    my %words = ();
    foreach my $fh (@fhlist)
    {
        my $line = <$fh>;
        return unless defined $line;
        chomp $line;
        $words{$line}++;
    }
    my $combo = '';
    foreach my $word (keys %words)
    {
        return $word if ($words{$word} > 2);
        $combo .= $word if ($words{$word} == 2);
    }
    # e.g. "are*are#" becomes "are*#": drop the duplicated word, keep both markers
    $combo =~ s/(\W)\w+(\W)/$1$2/;
    return $combo;
}
Example Data and Results
$ perl datagenerator.pl
$ perl majorityvoter.pl > results.txt
$ paste words?.txt results.txt
we* we$ we& we# we# we#
are* are# are# are* are$ are*#
the* the& the# the# the& the&#
people& people& people$ people# people# people&#
in# in* in$ in* in* in*
charge* charge# charge& charge* charge# charge#*
and$ and* and$ and& and$ and$
what& what& what$ what& what# what&
we# we* we* we& we* we*
say$ say& say$ say$ say$ say$
goes$ goes& goes# goes# goes# goes#
$
This seems to be correct for the test data in the files generated.
Revised requirements - example output
The 'revised requirements' replaced the '*#$&' markers after the words with a tab and one of the letters 'ABCD'. After some swift negotiation, the question was restored to its original form. This output is from a suitably adapted version of the answer above - 3 code lines changed, 2 in the data generator, 1 in the majority voter. Those changes are not shown - they are trivial (one plausible reconstruction follows the output below).
we C we D we C we C we D we C
are C are D are C are B are A are C
the B the D the A the A the D the A|D
people D people B people A people B people D people B|D
in D in B in C in B in D in D|B
charge C charge D charge D charge D charge A charge D
and A and B and C and C and B and B|C
what B what B what B what C what C what B
we D we B we D we B we A we B|D
say D say D say B say D say D say D
goes A goes C goes A goes C goes A goes A
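For reference, the 'trivial' changes are not spelled out in the answer; this is my own guess at what they could look like, not the author's actual diff. Two lines change in the data generator and one in the majority voter:

# Data generator: emit "word<TAB>letter" instead of "word<symbol>" (two changed lines)
my $suffix = substr('ABCD', rand(4), 1);
print $fh "$w\t$suffix\n";

# Majority voter: merge tied entries such as "the\tA" . "the\tD" into "the\tA|D" (one changed line)
$combo =~ s/\t(\w)\w+\t/\t$1|/;

The order of the joined tags (A|D versus D|A) depends on hash key order, which matches the mixed orderings visible in the example output above.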
Revised test generator - for configurable number of files
Now that the poster has worked out how to handle the revised scenario, this is the data generator code I used - with 5 tags (A-E). Clearly, it would not take a huge amount of work to configure the number of tags on the command line.
#!/usr/bin/env perl
use strict;
use warnings;
my $fmax  = scalar(@ARGV) > 0 ? $ARGV[0] : 5;
my $tags  = 'ABCDE';
my $ntags = length($tags);
my $fmt   = sprintf "words$fmax-%%0%0dd.txt", length($fmax);

foreach my $fnum (1..$fmax)
{
    my $file = sprintf $fmt, $fnum;
    open my $fh, '>', $file or die "Failed to open $file for writing ($!)";
    foreach my $w (qw(We Are The People In Charge And What We Say Goes))
    {
        my $suffix = substr($tags, rand($ntags), 1);
        print $fh "$w\t$suffix\n";
    }
}
Revised Majority Voting Code - for arbitrary number of files
This code works with an essentially arbitrary number of files. As noted in one of the (many) comments, it does not check that the word is the same in each file, as the question requires; you could get quirky results if the words are not the same (a possible check is sketched after the code).
#!/usr/bin/env perl
use strict;
use warnings;

my @files = scalar @ARGV > 0 ? @ARGV :
            ( "words1.txt", "words2.txt", "words3.txt",
              "words4.txt", "words5.txt"
            );
my $voters = scalar(@files);
my @fh;

{
    my $n = 0;
    foreach my $file (@files)
    {
        open my $f, '<', $file or die "Can't open $file for reading ($!)";
        $fh[$n++] = $f;
    }
}

while (my $r = process_line(@fh))
{
    print "$r\n";
}

sub process_line
{
    my(@fhlist) = @_;
    my %words = ();
    foreach my $fh (@fhlist)
    {
        my $line = <$fh>;
        return unless defined $line;
        chomp $line;
        $words{$line}++;
    }
    return winner(%words);
}

# Get tag X from entry "word\tX".
sub get_tag_from_word
{
    my($word) = @_;
    return (split /\s/, $word)[1];
}

sub winner
{
    my(%words) = @_;
    my $maxscore = 0;
    my $winscore = ($voters / 2) + 1;
    my $winner   = '';
    my $taglist  = '';
    foreach my $word (sort keys %words)
    {
        return "$word\t$words{$word}" if ($words{$word} >= $winscore);
        if ($words{$word} > $maxscore)
        {
            $winner = $word;
            $winner =~ s/\t.//;
            $taglist = get_tag_from_word($word);
            $maxscore = $words{$word};
        }
        elsif ($words{$word} == $maxscore)
        {
            my $newtag = get_tag_from_word($word);
            $taglist .= "|$newtag";
        }
    }
    return "$winner\t$taglist\t$maxscore";
}
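As noted above, the voter assumes that line N holds the same word in every file. A hedged sketch of an optional consistency check (the check_same_word helper is hypothetical and not part of the answer; in process_line you would push each chomped $line onto an array and call it before tallying):

# Hypothetical guard: die if the files disagree on the word for the current line.
sub check_same_word
{
    my(@lines) = @_;
    my %seen;
    foreach my $line (@lines)
    {
        my($word) = split /\t/, $line;   # the word is the field before the tab
        $seen{$word} = 1;
    }
    die "Files disagree on the current line's word: ", join(', ', sort keys %seen), "\n"
        if keys %seen > 1;
}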
One Example Run
After considerable experimentation on the data presentation, one particular set of data I generated gave the result:
We A|B|C|D|E 2 B C C E D A D A E B
Are D 4 C D B A D B D D B E
The A 5 D A B B A A B E A A
People D 4 E D C D B E D D B C
In D 3 E C D D D B C A A B
Charge A|E 3 E E D A D A B A E B
And E 3 C E D D C A B E B E
What A 5 B C C A A A B A D A
We A 4 C A A E A E C D A E
Say A|D 4 A C A A D E D A D D
Goes A 3 D B A C C A A E E B
The first column is the word; the second is the winning tag or tags; the third (numeric) column is the maximum score; the remaining 10 columns are the tags from the 10 data files. As you can see, there are two each of 'We A', 'We B', ... 'We E' in the first row. I've also generated (but not preserved) one result set where the maximum score was 7. Given enough repetition, these sorts of variations will show up.
Sounds like a job for a hash of hashes. Untested code:
use strict;
use warnings;
use 5.010;
use autodie;
use List::Util qw( sum reduce );

my %totals;
my @files = map "words$_.txt", 1..5;

for my $file (@files) {
    open my $fh, '<', $file;
    while (<$fh>) {
        chomp;
        my ($word, $sign) = /(\w+)(\W)/;
        $totals{$word}{$sign}++;
    }
}

open my $totals_fh, '>', 'outfile.txt';

my @sorted_words = sort { sum values %{$totals{$a}} <=> sum values %{$totals{$b}} } keys %totals; # Probably something fancier here.

for my $word (@sorted_words[0, 1]) {
    #say {$totals_fh} $word, join('', keys %{$totals{$word}} ), "\t- ", function_to_decide_text($totals{$word});
    say {$totals_fh} $word, reduce {
        $totals{$word}{ substr $a, 0, 1 } == $totals{$word}{$b} ? $a . $b
      : $totals{$word}{ substr $a, 0, 1 } >  $totals{$word}{$b} ? $a
      :                                                           $b;
    } keys %{ $totals{$word} };
}
EDIT: Forgot about the only two winners part. Fixed, somewhat.
EDIT2: Fixed as per comments.
#!/usr/bin/perl
use strict;
use warnings;

my @files = qw(file1 file2 file3 file4 file5);
my $symbols = '*#$&'; # no need to escape them as they'll be in a character class
my %words;

foreach my $file (@files) {
    open(my $fh, '<', $file) or die "Cannot open $file: $!";
    while (<$fh>) {
        if (/^(\w+[$symbols])$/) {
            $words{$1} ++; # count the occurrences of each word
        }
    }
    close $fh;
}

my $counter  = 0;
my $previous = -1;
foreach my $word (sort {$words{$b} <=> $words{$a}} keys %words) {
    # make sure you don't exit if two words at the top of the list
    # have the same number of occurrences
    if ($previous != $words{$word}) {
        last if $counter > 1;
    }
    $counter ++; # count the output
    $previous = $words{$word};
    print "$word occurred $words{$word} times.\n";
}
Worked when I tried it out...
Related
I have a file like below
1 B B C D B
2 K B D D K
1 B B C D B
2 K B D D K
1 B B C D B
2 K B D D K
I want the output to look like this
1 B C D
2 K B D
1 B C D
2 K B D
1 B C D
2 K B D
Sort commands don't work, so I tried writing a Perl program -
use strict;
use Data::Dumper;

my $file1 = <$ARGV[0]>;
open (IF2, "$file1") || die "Cannot open the file\n";
open (OUT, ">$file1.out") || die "Cannot open the out file\n";

my $k = 0;my $i=0;
my @line;
my $m;
my @line2;
while ( chomp($m = <IF2>) ) {
    my $count = 0;
    @line2 = split(/\t/,$m);#<stdin>;
    my $l = length @line2;print $l;<stdin>;
    for (my $x = 0; $x < $l;$x++) {
        my $k = 0;
        for (my $y = 0;$y < $l; $y) {
            $i++;
            #
            if ($count == 0)
            {
                print OUT "\t$line2[$x]";
                $count++;
            }
            if ($count != 0 && $x != $y)
            {
                if ($line2[$x] eq $line2[$y])
                {
                    $k++;
                }
            }
        }
        if ($k == 0)
        {
            print OUT "\t$line2[$x]";
        }
    }
    print OUT "\n";
}
print $i;
close IF2;
close OUT;
But it didn't work.
Can someone help?
Note: The input and output examples in the question were edited; the data is now consistent with the title, which asks to remove all duplicates. (See the original question for how it looked.) I am leaving the answer to the original question as well, until we hear from the poster.
I would like to make a general remark first.
Your posted code is an honest attempt to write a C-style program in Perl. I recommend resisting that and learning how to use Perl instead. It takes a little investment of time and effort, but it goes quickly and pays off quickly. As you can see below, it makes many things incomparably easier.
Even so, there are issues with the posted code, but I can't write a code review now.
Edited question
We need to remove all duplicates from each line. One way: split the line by whitespace then remove duplicates from the list, which is a standard task with ready solutions.
use warnings;
use strict;
use List::MoreUtils qw(uniq);

my $file    = '...';
my $fileout = '...';

open my $fh,     '<', $file    or die "Can't open $file: $!";
open my $fh_out, '>', $fileout or die "Can't open $fileout: $!";

while (<$fh>)
{
    my @unique = uniq split;
    print $fh_out "@unique\n";
}
close $fh;
close $fh_out;
The line with its spaces is restored by printing the quoted array, whereby spaces (or whatever is in $") are inserted between elements. The other option is to join the resulting list
my $res = join ' ', uniq split;
print $fh_out $res, "\n";
or just print $fh_out join(' ', uniq split), "\n";.
This uses uniq from the List::MoreUtils module. Note from the uniq documentation that
The order of elements in the returned list is the same as in LIST.
Since List::MoreUtils has come up, take note of the core module List::Util as well.
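As an aside (not part of the answer above): recent versions of List::Util (1.45 and later, if I recall correctly) also ship a uniq, and the classic no-module idiom uses a seen-hash. A minimal sketch that keeps the first occurrence of each field, like uniq does:

use strict;
use warnings;

while (my $line = <>) {
    my %seen;
    my @unique = grep { !$seen{$_}++ } split ' ', $line;   # keep first occurrence of each field
    print "@unique\n";
}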
A one-liner version
perl -MList::MoreUtils=uniq -lne'print join " ", uniq split' input > output
or
perl -MList::MoreUtils=uniq -lane'print join " ", uniq #F' input > output
See Command switches in perlrun
Original question (see it in the edit history)
I was answering the question with the belief that this was the input
1 B B C D B 2 K B D D K 1 B B C D B 2 K B D D K 1 B B C D B 2 K B D D K
and this was the desired output
1 B C D 2 K B D 1 B C D 2 K B D 1 B C D 2 K B D
Judging by your desired output, you want to remove only adjacent duplicates (not make the elements unique, as the title says).
For that you can use the ability of regex to match repeated patterns, by using backreferences. First we need to strip all spaces, and we'll put them back at the end. For example
use warnings;
use strict;

my $file    = '...';
my $fileout = '...';

open my $fh,     '<', $file    or die "Can't open $file: $!";
open my $fh_out, '>', $fileout or die "Can't open $fileout: $!";

while (my $line = <$fh>)
{
    $line =~ s/\s*//g;        # remove spaces (and the newline)
    $line =~ s/(.)\1+/$1/g;   # remove adjacent duplicates
    $line =~ s/(.)/$1 /g;     # restore spaces
    print $fh_out "$line\n";
}
close $fh;
close $fh_out;
The . matches any character, replace it with something more restrictive if needed (for example \w, for a "word" character). See perlretut. Note that we cannot restore space along with replacement (like s/(.)\1+/$1 /g) since non-repeating chars aren't matched and don't get the space back.
This can be done in more concise ways.
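For instance (my variation, not from the answer above), a backreference on whole whitespace-separated fields removes adjacent duplicate fields in one pass and leaves the original spacing alone, which also copes with multi-character fields that the single-character . version would mangle:

use strict;
use warnings;

while (my $line = <>) {
    # collapse runs of the same field, e.g. "1 B B C D B" -> "1 B C D B"
    $line =~ s/\b(\S+)(?:\s+\1)+\b/$1/g;
    print $line;
}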
A one-liner version
perl -lpe's/\s*//g; s/(.)\1+/$1/g; s/(.)/$1 /g' input > output
See Command switches in perlrun
I suggest something like this. It finds all occurrences of whitespace followed by non-whitespace, and checks whether the non-whitespace has been seen before on the current line. The matched substring is deleted if the non-whitespace has been seen before; otherwise it is left untouched.
use strict;
use warnings 'all';
while ( <DATA> ) {
    my %seen;
    s/(\s+(\S+))/ $seen{$2}++ ? '' : $1 /eg;
    print;
}
__DATA__
1 B B C D B
2 K B D D K
1 B B C D B
2 K B D D K
1 B B C D B
2 K B D D K
output
1 B C D
2 K B D
1 B C D
2 K B D
1 B C D
2 K B D
This may be done in a one-liner like this
perl -pe 'my %s; s/(\s+(\S+))/ $s{$2}++ ? "" : $1 /eg' myfile
I'm trying to improve my script, in which I hope to match entries in input.txt (column 4: H1, 2HB, CA, HB3) against dictionary.txt and replace them with the appropriate values from dictionary.txt (column 2: H, HB, C, 3HB), using dictionary.txt as a dictionary:
input.txt
1 N 22 H1 MET
1 H 32 2HB MET
1 C 40 CA MET
2 H 35 HB3 ASP
dictionary.txt
MET H H1
MET HB 2HB
MET C CA
ASP 3HB HB3
output
1 N 22 H MET
1 H 32 HB MET
1 C 40 C MET
2 H 35 3HB ASP
I'm trying to approach this by first matching the word in input.txt (MET) and dictionary.txt (MET) and then performing the substitution. This is what I've written so far:
#!/usr/bin/perl
use strict;
use warnings;

my %dictionary;

open my $dic_fh, '<', 'dictionary.txt' or die "Can't open file: $!";
while (my $ref = <$dic_fh>) {
    chomp $ref;
    my @columns = split(/\t/, $ref);
    my $res_name = $columns[0];
    my $ref_nuc  = $columns[1];
    $dictionary{$res_name} = {$ref_nuc};

    open my $in_fh, '<', 'input.txt' or die "Can't open file: $!";
    while (my $line = <$in_fh>) {
        chomp $line;
        my @columns = split(/\t/, $line);
        my @name = $columns[3];
        if (my $name eq $res_name) {
            my $line = $_;
            foreach my $res_name (keys %dictionary) {
                $line =~ s/$name/$dictionary{$ref_nuc}/;
            }
            print $line;
        }
    }
}
The problem seems to be that you are assigning the single field $columns[3] to the array @name, and then expecting to find it in $name, which is a separate variable altogether. You even declare $name at the point of the comparison.
You are also executing the statement
$line =~ s/$name/$dictionary{$ref_nuc}/;
once for each key in the hash. That is unnecessary: it needs to be done only once. It is also better to change the value of $columns[3] to $dictionary{$columns[3]} instead of doing a search and replace on the whole line, as the target string may appear in other columns that you don't want to modify.
It is very simple to do by building a dictionary hash and replacing the fourth field of the input file with its dictionary lookup:
use strict;
use warnings;
use 5.010;
use autodie;

open my $fh, '<', 'dictionary.txt';

my %dict;
while ( <$fh> ) {
    my ($k, $v) = (split)[2,1];
    $dict{$k} = $v;
}

open $fh, '<', 'input.txt';

while ( <$fh> ) {
    my @fields = split;
    $fields[3] = $dict{$fields[3]};
    say join "\t", @fields;
}
output
1 N 22 H MET
1 H 32 HB MET
1 C 40 C MET
2 H 35 3HB ASP
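One small hedged refinement to the lookup above (my suggestion, not part of the answer): if a fourth-column value could be missing from dictionary.txt, a defined-or fallback (available since Perl 5.10, which the code already requests via use 5.010) keeps the original value instead of printing an undef warning:

# leave the field unchanged when the dictionary has no entry for it
$fields[3] = $dict{$fields[3]} // $fields[3];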
I tried to summarize the following table based on the same features in column 1:
infile:
A m
A m
A n
A n
A m
A c
A m
A i
A n
A n
B n
B n
B n
B n
B n
B n
C o
C i
C q
I wrote the following code, but I do not know why it does not report the last feature:
perl code.pl 1 2 infile
use warnings;
use strict;
my $col_feature = $ARGV[0];
my $col_to_be_collapsed = $ARGV[1];
my $infile = $ARGV[2];
open( my $fh1, "<$infile" );
my $temp;
my $line_count = 0;
my %count = ();
my @array = ();

while ( my $line = <$fh1> ) {
    chomp($line);
    my @line = split( "\t| ", $line );
    my $to_be_collapsed = $line[ $col_to_be_collapsed - 1 ];
    my $feature = $line[ $col_feature - 1 ];
    if ( $line_count >= 1 && $temp ne '' ) {
        my @temp = split( "\t| ", $temp );
        my $to_be_collapsed_temp = $temp[ $col_to_be_collapsed - 1 ];
        my $feature_temp = $temp[ $col_feature - 1 ];
        if ( $feature_temp eq $feature ) {
            push( @array, $to_be_collapsed );
        }
        else {
            map { $count{$_}++ } @array;
            print "$feature_temp:\t";
            print "$_:$count{$_}\t" foreach sort { $a cmp $b } keys %count;
            %count = ();
            @array = ();
            $temp = $line;
            push( @array, $to_be_collapsed );
            print "\n";
        }
    }
    else {
        $temp = $line;
        push( @array, $to_be_collapsed );
    }
    $line_count++;
}
#print $temp,"\n";
output:
A: c:1 i:1 m:4 n:4
B: n:6
But there is no report for C in the first column!
Thanks
It will be a lot easier to use a hash in this particular case, as you just need to keep a counter. (The reason your version never reports C is that it only prints a group when it sees the feature change; after the last line is read, the final accumulated group is never flushed.)
#!/usr/bin/perl
use strict;
use warnings;
use autodie;
#open my $fh, '<', 'infile'; # Uncomment for live file.
my $fh = \*DATA; # For testing only.
my %counter;
while (<$fh>) {
    my ( $outerkey, $innerkey ) = split;
    $counter{$outerkey}{$innerkey}++;
}

for my $outerkey ( sort keys %counter ) {
    print "$outerkey:";
    print "\t$_:$counter{$outerkey}{$_}" for sort keys %{ $counter{$outerkey} };
    print "\n";
}
__DATA__
A m
A m
A n
A n
A m
A c
A m
A i
A n
A n
B n
B n
B n
B n
B n
B n
C o
C i
C q
Output:
A: c:1 i:1 m:4 n:4
B: n:6
C: i:1 o:1 q:1
I have these 2 text files and I would like to find any mismatch in the 2nd column between the files. The mismatch to be identified is based on the types F, P and N, regardless of which lines they occur on. I have 1 F and 3 P in the first file, while the second file has 2 P, 1 N and 1 F. After the comparison, both files should have an equal occurrence of each type: 1 F, 3 P and 1 N.
Text1:
f0x11 F
f0x34 P
drx99
dex67 P
edx43 P
sdx33
Text2:
1 P
2 N
4
5 F
6
7 P
Expected Output:
Text 1 has missing type of N
Text 2 has missing type of P
What I have tried so far does not produce the desired output.
code:
use strict;
my %ref_data;
my %ref_data2;
open my $fh, '<', 'Text1' or die "Could not open file to read:$!";
while (<$fh>) {
    chomp;
    my ($res, $type) = split;
    if (defined $type){
        $ref_data{$type} = "$type";
    }
}

our ($data,$data2);

open $fh, '<', 'Text2' or die "Could not open file to read:$!";
while (<$fh>) {
    chomp;
    my ($res, $type) = split;
    if (defined $type){
        $ref_data2{$type}= "$type";
        $data2= $ref_data2{$type};
        $data = $ref_data{$type};
        print "File 2 has missing type of $type\n" unless $data;
    }
}

foreach ($data){
    print "File 1 has missing type of $_\n" if $data ne $data2;
}
You appear to want to keep track of how many times the values in Column 2 occur within each file -- for example, in a comment you write, "I have 1F, 3P in first file while 2P,1N and 1Fin second file". If that's the case, you need a better data structure.
Specifically, one that counts occurrences of the values in Column 2, and you need those counts to be tracked separately for each file. That suggests a hash-of-hashes.
use strict;
use warnings;

# Example usage:
#   perl YOUR_SCRIPT.pl a.txt b.txt
my @files = @ARGV;

# Count the values in Column 2, organizing the tallies like this:
#   $tallies{COL_2}{FILE_NAME} = N
my %tallies;
while (<>) {
    my @cols = split;
    $tallies{$cols[1]}{$ARGV} ++ if @cols > 1;
}

# Print discrepancies.
for my $c2 (keys %tallies) {
    my @t = map { $tallies{$c2}{$_} || 0 } @files;
    next if $t[0] == $t[1];
    print "$c2: $files[0] has $t[0]; $files[1] has $t[1]\n";
}
Example output:
N: a.txt has 0; b.txt has 1
P: a.txt has 3; b.txt has 2
Also worth noting: this code does not have to open any files explicitly, and file names are not hard-coded in the program. Instead, we pass input file names as command-line arguments, get those arguments via @ARGV, process lines in those files via <>, and know which file we're currently processing via $ARGV.
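For readers unfamiliar with that mechanism, a minimal stand-alone illustration (not part of the answer above) of how <> and $ARGV interact:

use strict;
use warnings;

# Run as: perl show_sources.pl a.txt b.txt
# <> reads the files named in @ARGV one after another;
# $ARGV always holds the name of the file the current line came from.
while (<>) {
    print "$ARGV: $_";
}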
I've refactored your code where you seemed to be duplicating the same behaviour.
The output isn't to spec, but it should be clear enough for you to understand and finish up yourself.
I added a close $fh; and use warnings; as well.
#!/usr/bin/perl
use strict;
use warnings;

#run
my %max; # hash of combined data
my $file_data_1 = parse_file_into_hash("text1", \%max);
my $file_data_2 = parse_file_into_hash("text2", \%max);
diff_hashes(\%max, $file_data_1, $file_data_2);

# diff_hashes($max, $h1, $h2)
#
# diffs 2 hash refs against a combined $max hash and prints results
sub diff_hashes {
    my ($max, $h1, $h2) = @_;

    # TODO - do all the comparisons and some error checking (if keys exist etc...) here
    for my $key (keys %$max) {
        print "max/combined: $key = $max->{$key}\n";
        my $h1_print = exists $h1->{$key} ? $h1->{$key} : "0";
        my $h2_print = exists $h2->{$key} ? $h2->{$key} : "0";
        print "h1: $key = $h1_print\n";
        print "h2: $key = $h2_print\n";
    }
}

# parse_file_into_hash($file, $max)
#
# $max is a hash reference (passed by reference) so you can count occurrences over
# multiple files...
# returns a reference to a hash of ( $type => $count )
sub parse_file_into_hash {
    my ($file, $max) = @_;

    my %ref_data;
    open my $fh, '<', $file or die "Could not open file to read:$!";
    while (my $line = <$fh>) {
        chomp $line;
        my ($res, $type) = split /\s+/, $line;
        if ($type) {
            $ref_data{$type}++;
            if (!exists $max->{$type} || $ref_data{$type} > $max->{$type}) {
                $max->{$type} = $ref_data{$type};
            }
        }
    }
    close $fh;

    return \%ref_data;
}
Output run against your example files:
$ ./example.pl
max/combined: F = 1
h1: F = 1
h2: F = 1
max/combined: N = 1
h1: N = 0
h2: N = 1
max/combined: P = 3
h1: P = 3
h2: P = 2
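To get from that diagnostic output to the exact wording in the question, one hedged way to finish (my sketch, not part of the refactoring above) is an extra helper called the same way as diff_hashes, comparing each file's tally with the combined maximum:

# Hypothetical helper: report which file is short of which type.
sub report_missing {
    my ($max, $h1, $h2) = @_;
    for my $key (keys %$max) {
        my $c1 = exists $h1->{$key} ? $h1->{$key} : 0;
        my $c2 = exists $h2->{$key} ? $h2->{$key} : 0;
        print "Text 1 has missing type of $key\n" if $c1 < $max->{$key};
        print "Text 2 has missing type of $key\n" if $c2 < $max->{$key};
    }
}

With the example files this prints "Text 1 has missing type of N" and "Text 2 has missing type of P", matching the expected output in the question.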
A file (a.txt) contains:
# this is test file
Data: 15th may 2010
Records :
a
b
c
d
g
l
Just consider that I want to add a new record "f" - I want to add it in between d and g.
You can use Tie::File and treat the text file as an array. This example assumes that there are no blank lines and no initial text:
use Tie::File;

tie my @array, 'Tie::File', 'filename' or die $!;

my $rec = 'f';
for my $i (0..$#array) {
    if (($array[$i] cmp $rec) == 1) {
        splice @array, $i, 0, $rec;
        last;
    }
}
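Two caveats with that loop, and a hedged adaptation (mine, not the answerer's) for the file exactly as shown in the question: the comparison should only start after the header lines, and if the new record sorts after every existing record it is never spliced in, so it has to be appended at the end:

use strict;
use warnings;
use Tie::File;

tie my @lines, 'Tie::File', 'a.txt' or die $!;

my $rec   = 'f';
my $start = 0;
# skip the comment, date and "Records :" header plus any blank lines
$start++ while $start < @lines and $lines[$start] !~ /^Records\s*:/;
$start++;   # first candidate line is the one after "Records :"
$start++ while $start < @lines and $lines[$start] =~ /^\s*$/;

my $inserted = 0;
for my $i ($start .. $#lines) {
    if (($lines[$i] cmp $rec) == 1) {
        splice @lines, $i, 0, $rec;
        $inserted = 1;
        last;
    }
}
push @lines, $rec unless $inserted;   # $rec sorts after everything present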
in one line:
perl -le 'print for(sort((map {chomp; $_} (<>)), "f"))' < infile > outfile
You obviously need to process the headers beforehand, but the technique is pretty clear.
For example:
[dsm@localhost:~]$ perl -le 'print for(sort((map {chomp; $_;} (<>)), "f"))' <<EOP
> x
> v
> b
> m
> p
> o
> l
> j
> k
> EOP
b
f
j
k
l
m
o
p
v
x
[dsm@localhost:~]$
use Tie::File ();
use List::MoreUtils qw(lastidx);

my $new_item = 'f';

{
    my @file;
    tie @file, 'Tie::File', 'a.txt'
        or die "Could not open 'a.txt' for reading: $!";
    # lastidx returns the index of the last element that sorts before $new_item,
    # or -1 if there is none, which makes the insertion position 0.
    splice @file, (1 + lastidx { $_ lt $new_item } @file), 0, $new_item;
}