This question already has answers here:
How to declare function and use it recursively without "called too early to check prototype"
(2 answers)
Closed 1 year ago.
How to solve Perl error comes up when recursive function running
main::getE_Path_Rec() called too early to check prototype at ./test.pl line 28
shown by cat -n:
13 our ($whole, #result);
14
15 sub getE_Path_Rec ($\#$) { my ($path, #iOffNode, $offset) = #_;
16
17 $path=~ s#^/([^/]+)(.*)$#$2#;
18 my #OffNode; my $eleNow=$1;
19 for (#iOffNode) {
20 $eleNow=~ m#^([^[/,]+)(?|\[(\d+|#[^]]+)\]|,(\d+))?#;
21 #
22 if($2) {
23 getElem($1, $2-1, $_->[1], #OffNode);
24 return undef if !#OffNode;
25 if ($path) {
26 #
27 #
28 getE_Path_Rec( $path, #OffNode, $offset.${$OffNode[0]}[0])
29 }else {
30 push( #result, [$offset, ${$OffNode[0]}[1]])
31 }
36 }
37
38 return
39 }
How do we solve such in Perl up to the recursive function seamlessly ?
The docs for prototype in perlsub spell this out
... a recursive function with a prototype has to be predeclared for the prototype to take effect
So need
sub getE_Path_Rec ($\#$); # predeclare
# later
sub getE_Path_Rec ($\#$) { ... } # actual definition
Related
I am trying to perform some stack analysis on an MCU following the steps described here. The site links then to a Perl script that I launch as a post-build operation by meanings of a simple batch file.
The IDEA based on Eclipse uses the Perl executable at the path:
C:\..\S32DS_ARM_v2018.R1\utils\msys32\usr\bin\perl.exe
perl.exe -v gives:
This is perl 5, version 22, subversion 1 (v5.22.1) built for i686-msys-thread-multi-64int
The OS (windows) has a perl installation at
C:\Perl64\bin\perl.exe
perl.exe -v gives:
This is perl 5, version 24, subversion 3 (v5.24.3) built for MSWin32-x64-multi-thread
(with 1 registered patch, see perl -V for more detail)
I can confirm that avstak.pl (the perl script I am referring some lines above) produces different results with the former or the latter.
WHY this happens, is out of my area of expertise at the moment.
What I would like to understand is
Understand why this is happening;
Understand which perl provides the right outputs (pretty sure I suppose the 5.24.3 is the correct one);
Learning how to prevent this issue if I am going to use perl in future.
Thanks and best regards,
L.
Edit: the outcome of the script with the two different perl versions (reduced output for readability):
This one is result_5.22.1
Func Cost Frame Height
------------------------------------------------------------------------
> I2C_MasterGetTransferStatus 292 292 1
> FLEXIO_I2C_DRV_MasterStartTransfer 236 236 1
> CLOCK_DRV_Init 172 172 1
> CLOCK_SYS_SetConfiguration 172 172 1
> EDMA_DRV_ConfigScatterGatherTransfer 132 132 1
> CLOCK_SYS_SetSystemClockConfig 76 76 1
> FLEXIO_I2C_DRV_MasterInit 60 60 1
> EDMA_DRV_ConfigSingleBlockTransfer 60 60 1
> main 52 52 1
> LPI2C_DRV_MasterSetBaudRate 52 52 1
> LPI2C_DRV_MasterStartDmaTransfer 52 52 1
> FLEXIO_DRV_InitDriver 52 52 1
> I2C_MasterInit 44 44 1
> LPI2C_DRV_SlaveStartDmaTransfer 44 44 1
> CLOCK_SYS_UpdateConfiguration 44 44 1
> CLOCK_DRV_SetClockSource 44 44 1
> LPI2C_DRV_SlaveInit 44 44 1
> EDMA_DRV_Init 44 44 1
> EDMA_DRV_Deinit 36 36 1
> CLOCK_SYS_ConfigureSOSC 36 36 1
> CLOCK_SYS_ConfigureFIRC 36 36 1
vs
result_5.24.3
Func Cost Frame Height
------------------------------------------------------------------------
> main 536 52 9
I2C_MasterSendDataBlocking 484 28 8
> I2C_MasterReceiveDataBlocking 484 28 8
> I2C_MasterReceiveData 468 20 7
> I2C_MasterSendData 468 20 7
FLEXIO_I2C_DRV_MasterReceiveDataBlocking 456 28 7
FLEXIO_I2C_DRV_MasterSendDataBlocking 456 28 7
FLEXIO_I2C_DRV_MasterSendData 448 20 5
FLEXIO_I2C_DRV_MasterReceiveData 448 20 5
FLEXIO_I2C_DRV_MasterStartTransfer 428 236 4
> I2C_MasterGetTransferStatus 408 292 6
CLOCK_SYS_UpdateConfiguration 336 44 6
CLOCK_SYS_SetConfiguration 292 172 5
> CLOCK_DRV_Init 292 172 5
LPI2C_DRV_MasterReceiveDataBlocking 256 20 7
> I2C_SlaveReceiveDataBlocking 252 12 8
> I2C_SlaveSendDataBlocking 252 12 8
As you can see the hight number in the first version doesn't increase (and it should).
Cost and frame suffer the same issue I suppose.
the script is here:
#!/usr/bin/perl -w
# avstack.pl: AVR stack checker
# Copyright (C) 2013 Daniel Beer <dlbeer#gmail.com>
#
# Permission to use, copy, modify, and/or distribute this software for
# any purpose with or without fee is hereby granted, provided that the
# above copyright notice and this permission notice appear in all
# copies.
#
# THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL
# WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED
# WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE
# AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL
# DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR
# PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER
# TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
# PERFORMANCE OF THIS SOFTWARE.
#
# Usage
# -----
#
# This script requires that you compile your code with -fstack-usage.
# This results in GCC generating a .su file for each .o file. Once you
# have these, do:
#
# ./avstack.pl <object files>
#
# This will disassemble .o files to construct a call graph, and read
# frame size information from .su. The call graph is traced to find, for
# each function:
#
# - Call height: the maximum call height of any callee, plus 1
# (defined to be 1 for any function which has no callees).
#
# - Inherited frame: the maximum *inherited* frame of any callee, plus
# the GCC-calculated frame size of the function in question.
#
# Using these two pieces of information, we calculate a cost (estimated
# peak stack usage) for calling the function. Functions are then listed
# on stdout in decreasing order of cost.
#
# Functions which are recursive are marked with an 'R' to the left of
# them. Their cost is calculated for a single level of recursion.
#
# The peak stack usage of your entire program can usually be estimated
# as the stack cost of "main", plus the maximum stack cost of any
# interrupt handler which might execute.
use strict;
# Configuration: set these as appropriate for your architecture/project.
my $objdump = "arm-none-eabi-objdump";
my $call_cost = 4;
# First, we need to read all object and corresponding .su files. We're
# gathering a mapping of functions to callees and functions to frame
# sizes. We're just parsing at this stage -- callee name resolution
# comes later.
my %frame_size; # "func#file" -> size
my %call_graph; # "func#file" -> {callees}
my %addresses; # "addr#file" -> "func#file"
my %global_name; # "func" -> "func#file"
my %ambiguous; # "func" -> 1
foreach (#ARGV) {
# Disassemble this object file to obtain a callees. Sources in the
# call graph are named "func#file". Targets in the call graph are
# named either "offset#file" or "funcname". We also keep a list of
# the addresses and names of each function we encounter.
my $objfile = $_;
my $source;
open(DISASSEMBLY, "$objdump -dr $objfile|") ||
die "Can't disassemble $objfile";
while (<DISASSEMBLY>) {
chomp;
if (/^([0-9a-fA-F]+) <(.*)>:/) {
my $a = $1;
my $name = $2;
$source = "$name\#$objfile";
$call_graph{$source} = {};
$ambiguous{$name} = 1 if defined($global_name{$name});
$global_name{$name} = "$name\#$objfile";
$a =~ s/^0*//;
$addresses{"$a\#$objfile"} = "$name\#$objfile";
}
if (/: R_[A-Za-z0-9_]+_CALL[ \t]+(.*)/) {
my $t = $1;
if ($t eq ".text") {
$t = "\#$objfile";
} elsif ($t =~ /^\.text\+0x(.*)$/) {
$t = "$1\#$objfile";
}
$call_graph{$source}->{$t} = 1;
}
}
close(DISASSEMBLY);
# Extract frame sizes from the corresponding .su file.
if ($objfile =~ /^(.*).o$/) {
my $sufile = "$1.su";
open(SUFILE, "<$sufile") || die "Can't open $sufile";
while (<SUFILE>) {
$frame_size{"$1\#$objfile"} = $2 + $call_cost
if /^.*:([^\t ]+)[ \t]+([0-9]+)/;
}
close(SUFILE);
}
}
# In this step, we enumerate each list of callees in the call graph and
# try to resolve the symbols. We omit ones we can't resolve, but keep a
# set of them anyway.
my %unresolved;
foreach (keys %call_graph) {
my $from = $_;
my $callees = $call_graph{$from};
my %resolved;
foreach (keys %$callees) {
my $t = $_;
if (defined($addresses{$t})) {
$resolved{$addresses{$t}} = 1;
} elsif (defined($global_name{$t})) {
$resolved{$global_name{$t}} = 1;
warn "Ambiguous resolution: $t" if defined ($ambiguous{$t});
} elsif (defined($call_graph{$t})) {
$resolved{$t} = 1;
} else {
$unresolved{$t} = 1;
}
}
$call_graph{$from} = \%resolved;
}
# Create fake edges and nodes to account for dynamic behaviour.
$call_graph{"INTERRUPT"} = {};
foreach (keys %call_graph) {
$call_graph{"INTERRUPT"}->{$_} = 1 if /^__vector_/;
}
# Trace the call graph and calculate, for each function:
#
# - inherited frames: maximum inherited frame of callees, plus own
# frame size.
# - height: maximum height of callees, plus one.
# - recursion: is the function called recursively (including indirect
# recursion)?
my %has_caller;
my %visited;
my %total_cost;
my %call_depth;
sub trace {
my $f = shift;
if ($visited{$f}) {
$visited{$f} = "R" if $visited{$f} eq "?";
return;
}
$visited{$f} = "?";
my $max_depth = 0;
my $max_frame = 0;
my $targets = $call_graph{$f} || die "Unknown function: $f";
if (defined($targets)) {
foreach (keys %$targets) {
my $t = $_;
$has_caller{$t} = 1;
trace($t);
my $is = $total_cost{$t};
my $d = $call_depth{$t};
$max_frame = $is if $is > $max_frame;
$max_depth = $d if $d > $max_depth;
}
}
$call_depth{$f} = $max_depth + 1;
$total_cost{$f} = $max_frame + ($frame_size{$f} || 0);
$visited{$f} = " " if $visited{$f} eq "?";
}
foreach (keys %call_graph) { trace $_; }
# Now, print results in a nice table.
printf " %-30s %8s %8s %8s\n",
"Func", "Cost", "Frame", "Height";
print "------------------------------------";
print "------------------------------------\n";
my $max_iv = 0;
my $main = 0;
foreach (sort { $total_cost{$b} <=> $total_cost{$a} } keys %visited) {
my $name = $_;
if (/^(.*)#(.*)$/) {
$name = $1 unless $ambiguous{$name};
}
my $tag = $visited{$_};
my $cost = $total_cost{$_};
$name = $_ if $ambiguous{$name};
$tag = ">" unless $has_caller{$_};
if (/^__vector_/) {
$max_iv = $cost if $cost > $max_iv;
} elsif (/^main#/) {
$main = $cost;
}
if ($ambiguous{$name}) { $name = $_; }
printf "%s %-30s %8d %8d %8d\n", $tag, $name, $cost,
$frame_size{$_} || 0, $call_depth{$_};
}
print "\n";
print "Peak execution estimate (main + worst-case IV):\n";
printf " main = %d, worst IV = %d, total = %d\n",
$total_cost{$global_name{"main"}},
$total_cost{"INTERRUPT"},
$total_cost{$global_name{"main"}} + $total_cost{"INTERRUPT"};
print "\n";
print "The following functions were not resolved:\n";
foreach (keys %unresolved) { print " $_\n"; }
Edit2:
As Amon suggested to check, subsequent iterations of the script on the same dataset doesn't produce the same output. Values (cost/frame/height) are always the same but the order in which the functions are reported is different.
Trying to understand a piece of code in Perl as follows
my #a = qw(A B C D E F F A K O N D);
my #b = qw(S F S T F F S);
$s=sprintf "%s %s %d %d:%d:%d %d: %s" , sub { ($a[$_[6]], $b[$_[4]], $_[3], $_[2], $_[1], $_[0], $_[5]+1900) }->(localtime), $s;
This code is in a method in a package which has been called with three arguments, arrays a and b have been defined in the method above in this code and s is a defined string.
All I can understand is that it is an anonymous subroutine which is expected to return multiple values due to localtime function but I just cannot understand the mechanism that this code follows. I assume I have given enough information to ask this question if not let me know.
The syntax is an unusual way of encapsulating the modification of several values at once inside of one statement.
This code is probably used at the end of a subroutine, as it returns a list of values. The real code might look like this.
sub foo {
my #a = qw(Sun Mon Tue Wed Thu Fri Sat);
my #b = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
my $s = q{foobar};
# wday month mday hour min sec year
return sub { ( $a[ $_[6] ], $b[ $_[4] ], $_[3], $_[2], $_[1], $_[0], $_[5] + 1900 ) }
->(localtime), $s;
}
Let's go through that. I added the return to make it more clear what's happening.
There is an anonymous subroutine sub { ... }, which is defined and directly being called sub { ... }->(). This is like a lambda function. The arrow operator -> in this form is used for calling a code reference. It differs from the direct object syntax (e.g. $foo->frobnicate()) in that there is no method name after the arrow.
my $code = sub { print "#_" };
$code->(1, 2, 3);
This will output:
1, 2, 3
The localtime return value in list context is passed. That's a bunch of values for the current time, as mentioned in the documentation.
# 0 1 2 3 4 5 6 7 8
my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
localtime(time);
Without arguments, localtime implicitly uses time as its argument. Inside of the anonymous subroutine, those values end up in #_, which is the argument list of the sub. So $_[0] is the seconds part of the current local date.
Then there is #a and #b, which are really badly chosen names. If it's supposed to be short, I'd at least have called them #d and #m for weekdays and months.
The anonymous sub returns a list of values. It's the current weekday, as looked up in the #a array, where Sunday is index 0, the current month as looked in #b, the day, hour, minute and second, and the year as a four-digit number (hence the + 1900).
The overall function foo returns this list, and another value $s as the last element of its return value list.
If you were to print all of that, you would get:
print join q{ }, foo();
__END__
Mon Sep 25 16 10 33 2017 foobar
In the context of the sprintf, this becomes more clear. It's likely for adding a timestamp to a log entry.
sub log {
my $s = shift;
my #a = qw(Sun Mon Tue Wed Thu Fri Sat);
my #b = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
# 0 1 2 3 4 5 6 7
$s = sprintf "%s %s %d %d:%d:%d %d: %s",
# wday month mday hour min sec year
# 0 1 2 3 4 5 6
sub { ( $a[ $_[6] ], $b[ $_[4] ], $_[3], $_[2], $_[1], $_[0], $_[5] + 1900 ) }
# 7
->(localtime), $s;
return $s;
}
print &log('stuff happened'); # we need the & because CORE::log
I've marked each value with the position of the argument inside the sprintf pattern. $s seems to be the log message, which is replaced by the new string with the timestamp, and itself as the last argument.
However, this is a very unusual implementation, given that localtime will return the exact same thing in scalar context.
print &log("stuff happened\n");
print ''.localtime . ": stuff happened\n";
__END__
Mon Sep 25 16:24:40 2017: stuff happened
Mon Sep 25 16:24:40 2017: stuff happened
It's the return value of ctime(3). I first thought maybe it's a reimplementation because on the target system that kind of timestamp does not exist, but the localtime docs also say that this is not system-dependent.
The format of this scalar value is not locale-dependent but built into
Perl. For GMT instead of local time use the gmtime builtin. See also
the Time::Local module (for converting seconds, minutes, hours, and
such back to the integer value returned by time), and the POSIX
module's strftime and mktime functions.
The implementation with the sub is elegant, though not very Perl-ish. However, it's completely useless because Perl already brings this exact same format.
You could easily replace my log function with this.
sub log {
my $s = shift;
$s = localtime . " $s";
return $s;
}
But then again, if #a and #b are maybe localized, this makes makes sense. You give them as letter, which is probably not your real code, but I could see this being used as for example translating it to German like this.
use utf8;
sub log {
my $s = shift;
my #a = qw(So Mo Di Mi Do Fr Sa);
my #b = qw(Jan Feb Mär Apr Mai Jun Jul Aug Sep Okt Nov Dez);
# 0 1 2 3 4 5 6 7
$s = sprintf "%s %s %d %d:%d:%d %d: %s",
# wday month mday hour min sec year
# 0 1 2 3 4 5 6
sub { ( $a[ $_[6] ], $b[ $_[4] ], $_[3], $_[2], $_[1], $_[0], $_[5] + 1900 ) }
# 7
->(localtime), $s;
return $s;
}
print &log("stuff happened\n");
Now it makes a lot more sense.
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 7 years ago.
Improve this question
I think it won't be a straightforward thing to solve, but here it is. I have a file in the following format:
"2004-04-19 12:25:00" 44 44
"2004-04-19 12:26:30" 36 36
"2004-04-19 12:27:15" 34 34
and I need a file with exact same content, except that the first row of the first column is 0, and remaining values of the first column are times in minutes since time 0 (since first time). Times should be rounded 3 digit after decimal (dot), like this:
0.000 44 44
1.500 36 36
2.250 34 34
I am as disappointed as the others who downvoted your question with your reluctance to even attempt to do this for yourself before asking for free help. If you always do this then you will never learn the language and will always have to rely on other people to do it for you. However, I would prefer that you got your solution from Stack Overflow, as I am sure you will only go elsewhere and ask the same question
This will do as you ask. It expects the path to the input file as a parameter on the command line, and sends the output to STDOUT
To write the output to a file you would redirect it on the command line, like this
$perl fractional_minutes.pl mydata.txt > mydata_new.txt
and if you want to write back to the same file then you will need to use Perl's in-place option, like this
$perl -i.bak fractional_minutes.pl mydata.txt
which will save the old file as mydata.txt.bak
use strict;
use warnings;
use Time::Piece;
use Time::Seconds qw/ ONE_MINUTE /;
my $t0;
while ( <> ) {
s{^"([^"]*)"}{
my $t = Time::Piece->strptime($1, '%Y-%m-%d %H:%M:%S');
$t0 = $t if not defined $t0;
sprintf('%.3f', ($t - $t0) / ONE_MINUTE);
}e;
print;
}
output
0.000 44 44
1.500 36 36
2.250 34 34
With GNU awk for mktime() and gensub():
$ cat tst.awk
BEGIN { FS="\"" }
{ mins=mktime(gensub(/[-:]/," ","g",$2)) / 60 }
NR==1 { startMins=mins }
{ printf "%.3f%s\n", (mins-startMins), $3 }
$ awk -f tst.awk file
0.000 44 44
1.500 36 36
2.250 34 34
Here is a solution that uses Time::Moment:
#!/usr/bin/perl
use strict;
use warnings;
use Time::Moment;
my #times = map {
Time::Moment->from_string($_ . 'Z', lenient => 1)
} ('2004-04-19 12:25:00',
'2004-04-19 12:26:30',
'2004-04-19 12:27:15');
my $first = $times[0];
foreach my $time (#times) {
printf "%.3f\n", ($time->rd - $first->rd) / (1/60/24);
}
Output:
0.000
1.500
2.250
I'm a perl rookie and dont know how to do this...
My input file:
random text 00:02 23
random text 00:04 25
random text 00:06 53
random text 00:07 56
random text 00:12 34
... etc until 23:59
I would like to have the following output:
00:00
00:01
00:02 23
00:03
00:04
00:05
00:06 53
00:07 56
00:08
00:09
00:10
00:11
00:12 34
... etc until 23:59
So an output file with a every minute timestamp and the corresponding value if found in input file. My input file starts at 00:00 and ends 23:59
My code sofar:
use warnings;
use strict;
my $found;
my #event;
my $count2;
open (FILE, '<./input/input.txt');
open (OUTPUT, '>./output/output.txt');
while (<FILE>){
for ($count2=0; $count2<60; $count2++){
my($line) = $_;
if($line =~ m|.*(00:$count2).*|){
$found = "$1 \n";
push #event, $found;
}
if (#event){
}
else {
$found2 = "00:$count2,";
push #event, $found2;
}
}
}
print OUTPUT (#event);
close (FILE);
close (OUTPUT);
Here's one approach to your task:
use strict;
use warnings;
my %hash;
open my $inFH, '<', './input/input.txt' or die $!;
while (<$inFH>) {
my ( $hr_min, $sec ) = /(\d\d:\d\d)\s+(.+)$/;
push #{ $hash{$hr_min} }, $sec;
}
close $inFH;
open my $outFH, '>', './output/output.txt' or die $!;
for my $hr ( 0 .. 23 ) {
for my $min ( 0 .. 59 ) {
my $hr_min = sprintf "%02d:%02d", $hr, $min;
my $sec = defined $hash{$hr_min} ? " ${ $hash{$hr_min} }[-1]" : '';
print $outFH "$hr_min$sec\n";
}
}
close $outFH;
The first part reads your input file and uses a regex to grab the time at the end of each string. A hash of arrays (HoA) is built, with the HH:MM as the key and seconds in the array. For example:
09:14 => ['21','45']
This means that at 09:14 there were two second entires: one at 21 seconds and the other at 45 seconds. Since the times in the input file are in ascending order, the highest one in the array can be obtained by using the [-1] subscript.
Next, two loops are set up: the outer is (0..23) and the inner (0..59), and sprintf is used to format the HH:MM. When a key is found in the hash that corresponds to the current HH:MM in the loops, HH:MM and the last item in the array (the largest seconds) is printed out to a file (e.g., 00:02 23). If there isn't a corresponding HH:MM in the hash, just the loop's HH:MM is printed (e.g., 00:03):
Sample output:
00:00
00:01
00:02 23
00:03
00:04 45
00:05
00:06 53
00:07 59
00:08
00:09
00:10
00:11
00:12 34
...
23:59
Hope this helps!
This is best done with a hash, as Kenosis has already shown. There are some simplifications/improvements that can be done, however.
By using assignment = we store the latest value for each time, because identical hash keys will overwrite each other.
The range operator .. can also increment strings, so that we can get a range of strings, like 00, 01, ... 59.
The defined-or operator // can be used as a more concise way to check if a key for a certain time is defined.
Using \d+ rather than .+ will be much safer, as it will prevent something like hindsight is 20:20 at 01:23 45 to match 20:20 incorrectly.
We do not use hardcoded file names, instead using shell redirection and arguments.
In the below example code, I used a smaller range of numbers for demonstration purposes. I also used the DATA file handle so that this code can be copy/pasted and tried out. To try it, change <DATA> to <> and run it like this:
perl script.pl input.txt > output.txt
Code:
use strict;
use warnings;
use feature 'say';
my %t;
while (<DATA>) {
if (/((\d{2}:\d{2})\s+\d+)$/) {
$t{$2} = $1; # store most recent value
}
}
for my $h ('00' .. '00') {
for my $m ('00' .. '12') {
my $time = "$h:$m";
say $t{$time} // $time; # say defined $t{$time} ? $t{$time} : $time;
}
}
__DATA__
random text 00:02 23
random text 00:04 25
random text 00:06 53
random text 00:07 56
random text 00:12 34
random text 00:12 39
Output:
00:00
00:01
00:02 23
00:03
00:04 25
00:05
00:06 53
00:07 56
00:08
00:09
00:10
00:11
00:12 39
Need help in merging/concatenating /combining /binding etc
I have several ascii files each defining one variable which I have converted to a single column array
I have such columnised data for many variables ,so I need to perform a column bind like R does and make it one single file.
I can do the same in R but there are too many files. Being able to do it with one single code will help save a lot of time.
Using the following code ,new to perl and need help with this.
#filenames = ("file1.txt","file2.txt");
open F2, ">file_combined.txt" or die;
for($j = 0; $j< scalar #filenames;$j++){
open F1, $filenames[$j] or die;
for($i=1;$i<=6;$i++){$line=<F1>;}
while($line=<F1>){
chomp $line;
#spl = split '\s+', $line;
for($i=0;$i<scalar #spl;$i++){
print F2 "$spl[$i]\n";
paste "file_bio1.txt","file_bio2.txt"> file_combined.txt;
}
}
close F1;
}
Input files here are Ascii text files of a raster.They look like this
32 12 34 21 32 21 22 23
12 21 32 43 21 32 21 12
The above mentioned code without the paste syntax converts these files into a single column
32
12
34
21
32
21
22
23
12
21
32
43
21
32
21
12
The output should look like this
12 21 32
32 23 23
32 21 32
12 34 12
43 32 32
32 23 23
32 34 21
21 32 23
Each column represents a different ascii file.
I need around 15 such ascii files into one dataframe.I can do the same in R but it consumes a lot of time as the number of files and regions of interest are too many and the files are a bit large too.
Let's step through what you have...
# files you want to open for reading..
#filenames = ("file1.txt","file2.txt");
# I would use the 3 arg lexical scoped open
# I think you want to open this for 'append' as well
# open($fh, ">>", "file_combined.txt") or die "cannot open";
open F2, ">file_combined.txt" or die;
# #filenames is best thought as a 'list'
# for my $file (#filenames) {
for($j = 0; $j< scalar #filenames;$j++){
# see above example of 'open'
# - $filenames[$j] + $file
open F1, $filenames[$j] or die;
# what are you trying to do here? You're overriding
# $line in the next 'while loop'
for($i=1;$i<=6;$i++){$line=<F1>;}
# while(<$fh1>) {
while($line=<F1>){
chomp $line;
# #spl is short for split?
# give '#spl' list a meaningful name
#spl = split '\s+', $line;
# again, #spl is a list...
# for my $word (#spl) {
for($i=0;$i<scalar #spl;$i++){
# this whole block is a bit confusing.
# 'F2' is 'file_combined.txt'. Then you try and merge
# ( and overwrite the file) with the paste afterwards...
print F2 "$spl[$i]\n";
# is this a 'system call'?
# Missing 'backticks' or 'system'
paste "file_bio1.txt","file_bio2.txt"> file_combined.txt;
}
}
# close $fh1
close F1;
}
# I'm assuming there's a 'close F2' somewhere here..
It looks like you're trying to do this:
#filenames = ("file1.txt","file2.txt");
$oufile = "combined_text.txt";
`paste $filenames[0] $filenames[1] > $outfile`;