Prevent perl script from new line - perl

I have a Perl script:
$i=0;
while ( ($num = <STDIN>) =~ /\S/ ) {
push #lines, $num; $i++;
print ("$num"x"$i")."\n";
}
It prints this:
3
3
4
4
4
5
5
5
5
But I want it to print this:
3
3
4
4 4
5
5 5 5
How can I prevent Perl from printing a new line after every print?
I have tried this method, as you can see in the code snippet:
$num x $i

You probably need chomp($num); which will remove your input newline at the end of $num.
my $i=0;
while ( (my $num = <STDIN>) =~ /\S/ ) {
chomp($num);
$i++;
print "$num " x $i, "\n"
}
Or you could just:
print "$& " x ++$i, "\n" while <STDIN>=~/\d+/;
(Also, when asking code questions you should strip down your example to only contain what is relevant. Your push #lines, $num can only contribute to confuse potential answerers)

Related

Perl program to look for k-mer with specific sequence

I am trying to enhance a perl program I have previously written so that it recognizes top 1000 length 23 k-mers that ends with GG and print out the k-mers that only appears once in the sequence. However, no matter where I add the reg exp, I am unable to get the expected result.
The code I have:
#!/usr/bin/perl
use strict;
use warnings;
my $k = 23;
my $input = 'Fasta.fasta';
my $output = 'Fasta2.fasta';
my $match_count = 0;
#Open File
unless ( open( FASTA, "<", $input ) ) {
die "Unable to open fasta file", $!;
}
#Unwraps the FASTA format file
$/ = ">";
#Separate header and sequence
#Remove spaces
unless ( open( OUTPUT, ">", $output ) ) {
die "Unable to open file", $!;
}
<FASTA>; # discard 'first' 'empty' record
my %seen;
while ( my $line = <FASTA> ) {
chomp $line;
my ( $header, #seq ) = split( /\n/, $line );
my $sequence = join '', #seq;
for ( length($sequence) >= $k ) {
$sequence =~ m/([ACTG]{21}[G]{2})/g;
for my $i ( 0 .. length($sequence) - $k ) {
my $kmer = substr( $sequence, $i, $k );
##while ($kmer =~ m/([ACTG]{21}[G]{2})/g){
$match_count = $match_count + 1;
print OUTPUT ">crispr_$match_count", "\n", "$kmer", "\n" unless $seen{$kmer}++;
}
}
}
The input fasta file looks like this:
> >2L type=chromosome_arm; loc=2L:1..23011544; ID=2L; dbxref=REFSEQ:NT_033779,GB:AE014134; MD5=bfdfb99d39fa5174dae1e2ecd8a231cd; length=23011544; release=r5.54; species=Dmel;
CGACAATGCACGACAGAGGAAGCAGAACAGATATTTAGATTGCCTCTCAT
TTTCTCTCCCATATTATAGGGAGAAATATGATCGCGTATGCGAGAGTAGT
GCCAACATATTGTGCTCTTTGATTTTTTGGCAACCCAAAATGGTGGCGGA
TGAACGAGATGATAATATATTCAAGTTGCCGCTAATCAGAAATAAATTCA
TTGCAACGTTAAATACAGCACAATATATGATCGCGTATGCGAGAGTAGTG
CCAACATATTGTGCTAATGAGTGCCTCTCGTTCTCTGTCTTATATTACCG
CAAACCCAAAAAGACAATACACGACAGAGAGAGAGAGCAGCGGAGATATT
TAGATTGCCTATTAAATATGATCGCGTATGCGAGAGTAGTGCCAACATAT
TGTGCTCTCTATATAATGACTGCCTCTCATTCTGTCTTATTTTACCGCAA
ACCCAAATCGACAATGCACGACAGAGGAAGCAGAACAGATATTTAGATTG
CCTCTCATTTTCTCTCCCATATTATAGGGAGAAATATGATCGCGTATGCG
AGAGTAGTGCCAACATATTGTGCTCTTTGATTTTTTGGCAACCCAAAATG
GTGGCGGATGAACGAGATGATAATATATTCAAGTTGCCGCTAATCAGAAA
TAAATTCATTGCAACGTTAAATACAGCACAATATATGATCGCGTATGCGA
GAGTAGTGCCAACATATTGTGCTAATGAGTGCCTCTCGTTCTCTGTCTTA
TATTACCGCAAACCCAAAAAGACAATACACGACAGAGAGAGAGAGCAGCG
GAGATATTTAGATTGCCTATTAAATATGATCGCGTATGCGAGAGTAGTGC
CAACATATTGTGCTCTCTATATAATGACTGCCTCTCATTCTGTCTTATTT
TACCGCAAACCCAAATCGACAATGCACGACAGAGGAAGCAGAACAGATAT
and so on...
The expected outcome (print out the 23k-mers with GG ending that only appear once in the sequence) I am hoping to get:
>crispr_1
GGGTGGAGCTCCCGAAATGCAGG
>crispr_2
TTAATAAATATTGACACAGCGGG
>crispr_3
ATCGTGGGGCGTTTTGTGAAAGG
>crispr_4
AGTTTTTCACATAATCAGACAGG
>crispr_5
GTGTTGGATGAGTGTCCTCTGGG
>crispr_6
ATAGGTTGGTTGTTTTAAAAGGG
>crispr_7
AAATTTTTGTTGCCACTGAATGG
>crispr_8
AAGTTTCGAACTACGATGGTTGG
>crispr_9
CATGCTTTGTGGAAATAAGTCGG
>crispr_10
CACAGTGGGTGTTTGCACCTCGG
.... and so on
The current code I did create a fasta file with following:
>crispr_1
CGACAATGCACGACAGAGGAAGC
>crispr_2
GACAATGCACGACAGAGGAAGCA
>crispr_3
ACAATGCACGACAGAGGAAGCAG
>crispr_4
CAATGCACGACAGAGGAAGCAGA
>crispr_5
AATGCACGACAGAGGAAGCAGAA
>crispr_6
ATGCACGACAGAGGAAGCAGAAC
>crispr_7
TGCACGACAGAGGAAGCAGAACA
>crispr_8
GCACGACAGAGGAAGCAGAACAG
>crispr_9
CACGACAGAGGAAGCAGAACAGA
>crispr_10
ACGACAGAGGAAGCAGAACAGAT
.... and so on
while if I remove the
for (length($sequence) >=$k){
$sequence =~m/([ACTG]{21}[G]{2})/g;
and add the ##while ($kmer =~ m/([ACTG]{21}[G]{2})/g){
while ($kmer =~ m/([ACTG]{21}[G]{2})/g){
I am getting fasta file (with results which is not numbered correctly and unable to distinguish between duplicated and unique sequences):
>crispr_1
CATTTTCTCTCCCATATTATAGG
>crispr_2
ATTTTCTCTCCCATATTATAGGG
>crispr_3
TATTGTGCTCTTTGATTTTTTGG
>crispr_4
GATTTTTTGGCAACCCAAAATGG
>crispr_5
TTTTTGGCAACCCAAAATGGTGG
>crispr_6
TTGGCAACCCAAAATGGTGGCGG
>crispr_7
ACGACAGAGAGAGAGAGCAGCGG
>crispr_8
AAATCGACAATGCACGACAGAGG
>crispr_91
TATTGTGATCTTCGATTTTTTGG
>crispr_93
TTTTTGGCAACCCAAAATGGAGG
.... and so on
I have attempted to move the regex around my code, but none of them generated the expected result. I do not know what I did wrong over here. I have not add the exit the program when count reaches 1000 into the code yet.
Thanks in advance!
I'm not sure I understand your question completely, but could the following be what you need.
<FASTA>; # discard 'first' 'empty' record
my %data;
while (my $line = <FASTA>){
chomp $line;
my($header, #seq) = split(/\n/, $line);
my $sequence = join '', #seq;
for my $i (0 .. length($sequence) - $k) {
my $kmer = substr($sequence, $i, $k);
$data{$kmer}++ if $kmer =~ /GG$/;
}
}
my $i = 0;
for my $kmer (sort {$data{$b} <=> $data{$a}} keys %data) {
printf "crispr_%d\n%s appears %d times\n", ++$i, $kmer, $data{$kmer};
last if $i == 1000;
}
Some output on a file I have is:
crispr_1
ggttttccggcacccgggcctgg appears 4 times
crispr_2
ccgagctgggcgagaagtagggg appears 4 times
crispr_3
gccgagctgggcgagaagtaggg appears 4 times
crispr_4
gcacccgggcctgggtggcaggg appears 4 times
crispr_5
agcagcgggatcgggttttccgg appears 4 times
crispr_6
gctgggcgagaagtaggggaggg appears 4 times
crispr_7
cccttctgcttcagtgtgaaagg appears 4 times
crispr_8
gtggcagggaagaatgtgccggg appears 4 times
crispr_9
gatcgggttttccggcacccggg appears 4 times
crispr_10
tgagggaaagtgctgctgctggg appears 4 times
crispr_11
agctgggcgagaagtaggggagg appears 4 times
. . . .
ggcacccgggcctgggtggcagg appears 4 times
crispr_50
gaatctctttactgcctggctgg appears 4 times
crispr_51
accacaacattgacagttggtgg appears 2 times
crispr_52
caacattgacagttggtggaggg appears 2 times
crispr_53
catgctcatcgtatctgtgttgg appears 2 times
crispr_54
gattaatgaagtggttattttgg appears 2 times
crispr_55
gaaaccacaacattgacagttgg appears 2 times
crispr_56
aacattgacagttggtggagggg appears 2 times
crispr_57
gacttgatcgattaatgaagtgg appears 2 times
crispr_58
acaacattgacagttggtggagg appears 2 times
crispr_59
gaaccatatattgttatcactgg appears 2 times
crispr_60
ccacagcgcccacttcaaggtgg appears 1 times
crispr_61
ctgctcctgggtgtgagcagagg appears 1 times
crispr_62
ccatatattatctgtggtttcgg appears 1 times
. . . .
Update
To get the results you mentioned in your comment (below), replace the output code with:
my $i = 1;
while (my ($kmer, $count) = each %data) {
next unless $count == 1;
print "crispr_$i\n$kmer\n";
last if $i++ == 1000;
}
To answer my own comment to get first 1000.
<FASTA>; # discard 'first' 'empty' record
my %seen;
my #kmers;
while (my $line = <FASTA>){
chomp $line;
my($header, #seq) = split(/\n/, $line);
my $sequence = join '', #seq;
for my $i (0 .. length($sequence) - $k) {
my $kmer = substr($sequence, $i, $k);
if ($kmer =~ /GG$/) {
push #kmers, $kmer unless $seen{$kmer}++;
}
}
}
my $i = 1;
for my $kmer (#kmers) {
next unless $seen{$kmer} == 1;
print "crispr_$i\n$kmer\n";
last if $i++ == 1000;
}
Answer To check for uniqueness of final 12 chars ending in GG, the code below achieves that.
if ($kmer =~ /(.{10}GG)$/) {
my $substr = $1;
push #kmers, $kmer unless $seen{$substr}++;
}
my $i = 1;
for my $kmer (#kmers) {
my $substr = substr $kmer, -12;
next unless $seen{$substr} == 1;
print "crispr_$i\n$kmer\n";
last if $i++ == 1000;
}
Actually, this code line
$sequence =~m/([ACTG]{21}[G]{2})/g;
this line is just for the regex match, if you try to print this $sequence, it will surely print out the original result.
please add the code segement like this
if($sequence =~/([ACTG]{21}[G]{2}$)/g)
{
}#please remember to match the end with $.
BTW,It looks like the multiple for loop to parse this data is not very reasonable, the parse speed is without the best-efficiency.

Issue with nested loop

I got file called numbers.txt which is basically line with 5 numbers:
they look like this:
1 2 3 4 5
What I'm trying to achieve is I want to read those numbers from the line (which already works), then in each iteration I want to add +1 to every number which was read from that file and print them on screen with print, so the final result should look like:
1 2 3 4 5
2 3 4 5 6
3 4 5 6 7
4 5 6 7 8
5 6 7 8 9
.
#!/usr/bin/perl
use strict;
use warnings;
open("handle", 'numbers.txt') or die('unable to open numbers file\n');
$/ = ' ';
OUT: for my $line (<handle>) {
for (my $a = 0; $a < 5; $a++) {
chomp $line;
$line += 1;
print "$line ";
next OUT;
}
}
close("handle");
Haven't done looping in perl for a while now and would be great if someone could provide working example.
Also, it would be great if you could provide more than one working example, just to be future proof ;)
Thanks
You can try this on for size.
#!/usr/bin/perl
use strict;
use warnings;
open("handle", 'numbers.txt') or die('unable to open numbers file\n');
for my $line (<handle>) {
chomp $line;
for my $number (split /\s+/, $line) {
for (my $a = $number; $a < $number+5; $a++) {
print "$a ";
}
print "\n";
}
}
close("handle");
You can dispense with $/=' ' and instead let the outer loop iterate on lines of the file.
For each line you want to iterate for each number which is separated by white space, thus the split /\s+/, $line which gives you a list of numbers for the inner loop.
For your output $a starts at the number read from the file.
This will do what you're after:
use strict;
use warnings;
while(<DATA>) {
chomp;
print "$_\n";
my #split = split;
my $count = 0;
for (1..4){
$count++;
foreach (#split){
my $num = $_ + $count;
print "$num ";
}
print "\n";
}
}
__DATA__
1 2 3 4 5
Here no need to use nested loop it's always program make slower.
#!/usr/bin/perl
use strict;
use warnings;
my #num = split(" ",(<DATA>)[0]);
foreach my $inc (0..$#num)
{
print map{$inc+$_," "}#num; # Add one by one in array element
print "\n";
}
__DATA__
1 2 3 4 5
Update Added another method, this one in line with the posted approach.
Increment each number in the string, changing the string in place. Repeat that. Below are two ways to do that. Yet another method reads individual numbers and prints following integer sequences.
(1) With regular expressions. It also fits in one-liner
echo "1 2 3 4 5" | perl -e '$v = <>; for (1..5) { print $v; $v =~ s/(\d+)/$1+1/eg; }'
This prints the desired output. But better put it in a script
use warnings;
use strict;
my $file = 'numbers.txt';
open my $fh, '<', $file or die "can't open $file: $!";
while (my $line = <$fh>) {
# Add chomp($line) if needed for some other processing.
for (1..5) {
print $line;
$line =~ s/(\d+)/$1+1/eg;
}
}
The /e modifier is crucial for this. It makes the replacement side of the regex be evaluated as code instead of as a double-quoted string. So you can actually execute code there and here we add to the captured number, $1+1, for each matched number as /g moves down the string. This changes the string so the next iteration of the for (1..5) increments those, etc. I match multiple digits, \d+, which isn't necessary in your example but makes far more sense in general.
(2) Via split + map + join, also repeatedly changing the line in place
while (my $line = <$fh>) {
for (1..5) {
print $line;
$line = join ' ', map { $_+1 } split '\s+', $line;
}
}
The split gets the list of numbers from $line and feeds it to map, which increments each, feeding its output list to join. The joined string is assigned back to $line, and this is repeated. I split by \s+ to allow multiple white space but this makes it very 'relaxed' in what input format it accepts, see perlrecharclass. If you know it's one space please change that to ' '.
(3) Take a number at a time and print the integer sequence starting from it.
open my $fh, '<', $file or die "can't open $file: $!";
local $/ = ' ';
while (my $num = <$fh>) {
print "$_ " for $num..$num+4;
print "\n";
}
The magical 4 can be coded by pre-processing the whole line to find the sequence length, say by
my $len = () = $line =~ /(\d+)/g;
or by split-ing into an array and taking its scalar, then using $len-1.
Additional comments.
I recommend the three-argument open, open my $fh, '<', $file
When you check a call print the error, die "Your message: $!", to see the reason for failure. If you decide to quit, if ($bad) { die "Got $bad" }, then you may not need $!. But when an external call fails you don't know the reason so you need the suitable error variable, most often $!.
Your program has a number of problems. Here is what's stopping it working
You are setting the record separator to a single space. Your input file contains "1 2 3 4 5\n", so the while loop will iterate five times setting $line to "1 ", "2 ", "3 ", "4 ", "5\n"
Your for loop is set up to iterate five times. It does chomp $line which removes the space after the number, then increments $line and prints it. Then you jump out of the for loop, having executed it only once, with next OUT. This results in each value in the file being incremented by one and printed, so you get 2 3 4 5 6
Removing the unnecessary next OUT, produces something closer
2 3 4 5 6 3 4 5 6 7 4 5 6 7 8 5 6 7 8 9 6 7 8 9 10
There are now five numbers being printed for each number in the input file
Adding print "\n" after the for loop help separate the lines
2 3 4 5 6
3 4 5 6 7
4 5 6 7 8
5 6 7 8 9
6 7 8 9 10
Now we need to print the number before it is incremented instead of afterwards. If we swap $line += 1 and print "$line " we get this
1 2 3 4 5
2 3 4 5 6
3 4 5 6 7
4 5 6 7 8
5
6 7 8 9
What is happening here is that the 5 is still followed be a newline, which now appears in the output. The chomp won't remove this because it removes the value of $/ from the end of a string. You've set that to a space, so it will remove only spaces. The fix is to replace chomp with a substitution s/\s+//g which removes *all whitespace from the string. You also need to do that only once so I've put it outside the for loop at the top
Now we get this
1 2 3 4 5
2 3 4 5 6
3 4 5 6 7
4 5 6 7 8
5 6 7 8 9
And this is your code as it ended up
use strict;
use warnings;
open( "handle", 'numbers.txt' ) or die('unable to open numbers file\n');
$/ = ' ';
for my $line (<handle>) {
$line =~ s/\s+//g;
for ( my $a = 0; $a < 5; $a++ ) {
print "$line ";
$line += 1;
}
print "\n";
}
close("handle");
There are a few other best practices that could improve your program
Use use warnings 'all'
Use lexical file handles, and the three-parameter form of open
Use local if you are changing Perl's built-in variables
Put $! into your die string so that you know why the open failed
Avoid the C-style for loop, and iterate over a list instead
Making these fixes as well looks like this. The output is identical to the above
use strict;
use warnings 'all';
open my $fh, '<', 'numbers.txt'
or die qq{Unable to open "numbers.txt" for input: $!};
local $/ = ' ';
for my $line ( <$fh> ) {
$line =~ s/\s+//g;
for my $a ( 0 .. 4 ) {
print "$line ";
++$line;
}
print "\n";
}

Transposing an array of elements

I am trying to transpose an array.
I tried the following code...
#! /usr/bin/perl
use strict;
use warnings;
use autodie;
open my $fh, '<',"op.txt" || die "$!";
open my $wh , '>',"pwl.txt" || die "$!";
select ($wh);
while (my $line = <$fh>) {
my #rows = $line;
my #transposed;
for my $row (#rows) {
for my $column (0 .. $#{$row}) {
push(#{$transposed[$column]}, $row->[$column]);
}
}
for my $new_row (#transposed) {
for my $new_col (#{$new_row}) {
print $new_col, " ";
}
print "\n";
}
}
**********INPUT FILE******
1 2 3
4 5 6
7 8 9
********** EXPECTED OUTPUT FILE *******
1 4 7
2 5 8
3 6 9
******** GENERATED OUTPUT FILE *******
Currently couldn't able print anything. script shows the error
"can't use string ("1 4 7") as an array ref while "strict refs" in use
Reference:
used the following reference...
Transpose in perl
however in this reference example, array input lines are declared manually where as i am trying to process a array which is in a text file
could anybody help me where i did mistake?
Many Thanks
You did have to split input line as #dland suggested. But, there were a few other issues.
Here's the corrected code [please pardon the gratuitous style cleanup]:
#! /usr/bin/perl
use strict;
use warnings;
use autodie;
open my $fh, '<',"op.txt" || die "$!";
open my $wh , '>',"pwl.txt" || die "$!";
my #rows;
while (my $line = <$fh>) {
my #line = split(" ",$line);
push(#rows,\#line);
}
close($fh);
my #transposed;
for my $row (#rows) {
push(#transposed,[]);
}
my $rowidx = -1;
for my $rowptr (#rows) {
++$rowidx;
my $colidx = -1;
for my $rowval (#$rowptr) {
++$colidx;
###printf("R=%d C=%d\n",$rowidx,$colidx);
my $colptr = $transposed[$colidx];
$colptr->[$rowidx] = $rowval;
}
}
for my $new_row (#transposed) {
for my $new_col (#$new_row) {
print $wh $new_col, " ";
}
print $wh "\n";
}
close($wh);
Note: It's slightly harder to transpose a non-square matrix. The above code may need to be extended a bit for that.
You're trying to shove a scalar into an array:
my #row = $line;
I think what you really want is to split on spaces:
my #row = split / /, $line;
adding my #row =map [ split], $line to my initial code is helping to print the data to pwl.txt.
however its not printing side by side.. instead it is printing to new line.
I guess because of this Mr.Borodin didn't include matrix in side while loop!
1
2
3
4
5
6
7
8
9
Here's a simpler way of doing this. It assumes all the rows are of the same length, and that all the lines in the file contain data -- i.e. there are no blank lines
The name of the input file is expected as a parameter on the command line, and the output is sent to STDOUT so it can be redirected on the command line. For instance
perl transpose.pl op.txt > pwl.txt
use strict;
use warnings 'all';
my #matrix = map [ split ], <>;
print "#$_\n" for #matrix;
print "\n";
my #transpose;
for my $i ( 0 .. $#{ $matrix[0] } ) {
$transpose[$i] = [ map { $_->[$i] } #matrix ]
}
print "#$_\n" for #transpose;
print "\n";
output
1 2 3
4 5 6
7 8 9
1 4 7
2 5 8
3 6 9

Read specific column in perl

I'm new in perl. I have below text file and from there I want only one Time column and next columns are values. How can I create a text file with my desire output in perl.
Time Value Time Value Time Value
1 0.353366497 1 0.822193251 1 0.780866396
2 0.168834182 2 0.865650713 2 0.42429447
3 0.323540698 3 0.865984245 3 0.856875894
4 0.721728497 4 0.634773162 4 0.563059042
5 0.545131335 5 0.029808531 5 0.645993399
6 0.143720835 6 0.949973296 6 0.14425803
7 0.414601876 7 0.53421424 7 0.826148814
8 0.194818367 8 0.942334356 8 0.837107013
9 0.291448263 9 0.242588271 9 0.939609775
10 0.500159997 10 0.428897293 10 0.41946448
I've tried below code:
use strict;
use warnings;
use IO::File;
my $result;
my #files = (q[1.txt],q[2.txt],q[3.txt]);
my #fhs = ();
foreach my $file (#files) {
my $fh = new IO::File $file, O_RDONLY;
push #fhs, $fh if defined $fh;
}
while(1) {
my #lines = map { $_->getline } #fhs;
last if grep { not defined $_ } #lines[0..(#fhs-1)];
my #result=join(qq[\t], map { s/[\r?\n]+/ /g; $_ } #lines ) . qq[\r\n];
open (MYFILE, '>>Result.txt');
print (MYFILE "#result");
close (MYFILE);
}
I'd go with split.
use warnings;
use strict;
open (my $f, '<', 'your-file.dat') or die;
while (my $line = <$f>) {
my #elems = split ' ', $line;
print join "\t", #elems[0,1,3,5];
print "\n";
}
This is a one-liner; no need to write a script:
$ perl -lanE '$,="\t"; say #F[0,1,3,5]' 1.txt 2.txt 3.txt
If you like, you can shorten it to:
$ perl -lanE '$,="\t"; say #F[0,1,3,5]' [123].txt
Right now, you're just concatenating the lines of the files together. If that doesn't give you the output you like, you need to chop some columns out.
Since your output looks like you have tab delimited files as input, I split the lines coming in by tabs. And since you only wanted the second column, I only take the column at the first offset from the split.
my $line_num = 0;
while(1) {
my #lines = map { $_->getline } #fhs;
last if grep { not defined $_ } #lines[0..$#fhs];
$line_num++;
my #rows = map { [ split /\t/ ] } #lines;
my $time_val = $rows[0][0];
die "Time values are not all equal on line #$line_num!"
if grep { $time_val != $_->[0] } #rows
;
my $result = join( q[\t], $time_val, map { $_->[1] } #rows );
open (MYFILE, '>>Result.txt');
print (MYFILE "$result\n");
close (MYFILE);
}
Of course, there is no reason to do custom coding to split delimited columns:
use Text::CSV;
...
my $csv = Text::CSV->new( { sep_char => "\t" } );
while(1) {
my #rows = map { $csv->getline( $_ ) } #fhs;
last if grep { not defined $_ } #rows[0..$#fhs];
my ( $time_val, #time_vals ) = map { $_->[0] } #rows;
my #values = map { $_->[1] } #rows;
die "Time values are not all equal on line #$line_num!"
if grep { $time_val != $_ } #time_vals
;
my $result = join( q[\t], $time_val, #values );
...
}
use strict;
use warnings;
open(FH,"<","a.txt");
print "=========== A File content =========== \n";
my $a = `cat a.txt`;
print "$a\n";
my #temp = <>;
my (#arr, #entries, #final);
foreach ( #temp ) {
#arr = split ( " ", $_ );
push #entries, #arr;
}
close FH;
my #entries1 = #entries;
for(my $i = 7; $i<=$#entries; $i=$i+2) {
push #final, $entries[$i];
}
my $size = scalar #final;
open FH1, ">", "b.txt";
print FH1 "Time \t Value\n";
for(my $i = 0; $i < $size; $i++) {
my $j = $i+1;
print FH1 "$j \t $final[$i]\n";
}
close FH1;
print "============ B file content ===============\n";
my $b = `cat b.txt`;
print "$b";
O/P:
=========== A File content ===========
Time Value Time Value Time Value
1 0.353366497 1 0.822193251 1 0.780866396
2 0.168834182 2 0.865650713 2 0.42429447
3 0.323540698 3 0.865984245 3 0.856875894
4 0.721728497 4 0.634773162 4 0.563059042
5 0.545131335 5 0.029808531 5 0.645993399
6 0.143720835 6 0.949973296 6 0.14425803
7 0.414601876 7 0.53421424 7 0.826148814
8 0.194818367 8 0.942334356 8 0.837107013
9 0.291448263 9 0.242588271 9 0.939609775
10 0.500159997 10 0.428897293 10 0.41946448
============ B file content ===============
Time Value
1 0.353366497
2 0.822193251
3 0.780866396
4 0.168834182
5 0.865650713
6 0.42429447
7 0.323540698
8 0.865984245
9 0.856875894
10 0.721728497
11 0.634773162
12 0.563059042
13 0.545131335
14 0.029808531
15 0.645993399
16 0.143720835
17 0.949973296
18 0.14425803
19 0.414601876
20 0.53421424
21 0.826148814
22 0.194818367
23 0.942334356
24 0.837107013
25 0.291448263
26 0.242588271
27 0.939609775
28 0.500159997
29 0.428897293
30 0.41946448

read multiple lines and consolidate to single line [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. See also: Stack Overflow question checklist
Closed 9 years ago.
Improve this question
I want to process a file in such a way that, read from multiple lines and make a new file which will have a single line corresponding to this.
For eg.:
Input File:
12345 1 - - -
12346 - 2 - 4
12347 - - 3 -
12348 5 - - -
12349 - 6 - 8
12350 - - 7 -
Output file :
12346 1 2 3 4
12349 5 6 7 8
Take 3 lines to make a complete row.
How to do this in perl?
Do it in 3 simple steps:
Create a hash, keeping 1st word as key.
Read from files and keep the input in an array.
Print the elements to a new file against each each key.
perl -lane'
$r = $. % 3;
$f = $F[0] if $r==2;
$arr[$_] += $F[$_] for 1 .. $#F;
if (!$r) { print "$f #arr"; #arr=() }
' file
output
12346 1 2 3 4
12349 5 6 7 8
i think this will help you,if any clarification let me know:
input file is:
12345|1|-|-|-
12346|-|2|-|4
12347|-|-|3|-
12348|5|-|-|-
12349|-|6|-|8
12350|-|-|7|-
script
use strict;
use warnings;
sub processor{
(my #tmp )=#_;
my #inner;
shift(#tmp);
foreach my $element(#tmp){
$_=$element;
if((m/[0-9]/)){
push(#inner,$element);
}
}
return #inner;
}
open(INFILE,"infile.dat") or die "$!";
open(OFILE,">outfile.dat") or die "$!";
my #ecollector;
my $keyVal;
my $counter;
while(<INFILE>){
chomp($_);
my #tmp=split('\|',$_);
if(($. % 3) ne 0){
if(($. % 3) eq 2){ $keyVal=$tmp[0];}
push(#ecollector,processor(#tmp));
}
else{
push(#ecollector,processor(#tmp));
print OFILE "$keyVal\t".join("\t",sort(#ecollector))."\n";
#ecollector=();
}
}
close(INFILE);
close(OFILE);
This is my solution. input, and output files are 1st and 2nd argument from command-line:
#!/usr/bin/perl
use warnings;
use strict;
open(DATA, "<", $ARGV[0]) or die $!;
open(OUTPUT, ">", $ARGV[1]) or die $!;
my $count = 0;
my $id = 0;
my #arr;
foreach (<DATA>){
chomp;
my #data;
# if line is not empty
if($_ ne ""){
#data = split(" ", $_);
$id = $data[0];
# merge data
for(my $i = 1; $i <= $#data; $i++){
if($data[$i] ne "-"){
$arr[$i] = $data[$i];
}
}
$count++;
if ($count == 2){
$arr[0] = $id;
} elsif ($count == 3){
# reset countable variable
$count = 0;
# print output
print OUTPUT join " ",#arr,"\n";
}
}
}
close(DATA);
close(OUTPUT);