How do I change this to "idiomatic" Perl? - perl

I am beginning to delve deeper into Perl, but am having trouble writing "Perl-ly" code instead of writing C in Perl. How can I change the following code to use more Perl idioms, and how should I go about learning the idioms?
Just an explanation of what it is doing: This routine is part of a module that aligns DNA or amino acid sequences(using Needelman-Wunch if you care about such things). It creates two 2d arrays, one to store a score for each position in the two sequences, and one to keep track of the path so the highest-scoring alignment can be recreated later. It works fine, but I know I am not doing things very concisely and clearly.
edit: This was for an assignment. I completed it, but want to clean up my code a bit. The details on implementing the algorithm can be found on the class website if any of you are interested.
sub create_matrix {
my $self = shift;
#empty array reference
my $matrix = $self->{score_matrix};
#empty array ref
my $path_matrix = $self->{path_matrix};
#$seq1 and $seq2 are strings set previously
my $num_of_rows = length($self->{seq1}) + 1;
my $num_of_columns = length($self->{seq2}) + 1;
#create the 2d array of scores
for (my $i = 0; $i < $num_of_rows; $i++) {
push(#$matrix, []);
push(#$path_matrix, []);
$$matrix[$i][0] = $i * $self->{gap_cost};
$$path_matrix[$i][0] = 1;
}
#fill out the first row
for (my $i = 0; $i < $num_of_columns; $i++) {
$$matrix[0][$i] = $i * $self->{gap_cost};
$$path_matrix[0][$i] = -1;
}
#flag to signal end of traceback
$$path_matrix[0][0] = 2;
#double for loop to fill out each row
for (my $row = 1; $row < $num_of_rows; $row++) {
for (my $column = 1; $column < $num_of_columns; $column++) {
my $seq1_gap = $$matrix[$row-1][$column] + $self->{gap_cost};
my $seq2_gap = $$matrix[$row][$column-1] + $self->{gap_cost};
my $match_mismatch = $$matrix[$row-1][$column-1] + $self->get_match_score(substr($self->{seq1}, $row-1, 1), substr($self->{seq2}, $column-1, 1));
$$matrix[$row][$column] = max($seq1_gap, $seq2_gap, $match_mismatch);
#set the path matrix
#if it was a gap in seq1, -1, if was a (mis)match 0 if was a gap in seq2 1
if ($$matrix[$row][$column] == $seq1_gap) {
$$path_matrix[$row][$column] = -1;
}
elsif ($$matrix[$row][$column] == $match_mismatch) {
$$path_matrix[$row][$column] = 0;
}
elsif ($$matrix[$row][$column] == $seq2_gap) {
$$path_matrix[$row][$column] = 1;
}
}
}
}

You're getting several suggestions regarding syntax, but I would also suggest a more modular approach, if for no other reason that code readability. It's much easier to come up to speed on code if you can perceive the big picture before worrying about low-level details.
Your primary method might look like this.
sub create_matrix {
my $self = shift;
$self->create_2d_array_of_scores;
$self->fill_out_first_row;
$self->fill_out_other_rows;
}
And you would also have several smaller methods like this:
n_of_rows
n_of_cols
create_2d_array_of_scores
fill_out_first_row
fill_out_other_rows
And you might take it even further by defining even smaller methods -- getters, setters, and so forth. At that point, your middle-level methods like create_2d_array_of_scores would not directly touch the underlying data structure at all.
sub matrix { shift->{score_matrix} }
sub gap_cost { shift->{gap_cost} }
sub set_matrix_value {
my ($self, $r, $c, $val) = #_;
$self->matrix->[$r][$c] = $val;
}
# Etc.

One simple change is to use for loops like this:
for my $i (0 .. $num_of_rows){
# Do stuff.
}
For more info, see the Perl documentation on foreach loops and the range operator.

I have some other comments as well, but here is the first observation:
my $num_of_rows = length($self->{seq1}) + 1;
my $num_of_columns = length($self->{seq2}) + 1;
So $self->{seq1} and $self->{seq2} are strings and you keep accessing individual elements using substr. I would prefer to store them as arrays of characters:
$self->{seq1} = [ split //, $seq1 ];
Here is how I would have written it:
sub create_matrix {
my $self = shift;
my $matrix = $self->{score_matrix};
my $path_matrix = $self->{path_matrix};
my $rows = #{ $self->{seq1} };
my $cols = #{ $self->{seq2} };
for my $row (0 .. $rows) {
$matrix->[$row]->[0] = $row * $self->{gap_cost};
$path_matrix->[$row]->[0] = 1;
}
my $gap_cost = $self->{gap_cost};
$matrix->[0] = [ map { $_ * $gap_cost } 0 .. $cols ];
$path_matrix->[0] = [ (-1) x ($cols + 1) ];
$path_matrix->[0]->[0] = 2;
for my $row (1 .. $rows) {
for my $col (1 .. $cols) {
my $gap1 = $matrix->[$row - 1]->[$col] + $gap_cost;
my $gap2 = $matrix->[$row]->[$col - 1] + $gap_cost;
my $match_mismatch =
$matrix->[$row - 1]->[$col - 1] +
$self->get_match_score(
$self->{seq1}->[$row - 1],
$self->{seq2}->[$col - 1]
);
my $max = $matrix->[$row]->[$col] =
max($gap1, $gap2, $match_mismatch);
$path_matrix->[$row]->[$col] = $max == $gap1
? -1
: $max == $gap2
? 1
: 0;
}
}
}

Instead of dereferencing your two-dimensional arrays like this:
$$path_matrix[0][0] = 2;
do this:
$path_matrix->[0][0] = 2;
Also, you're doing a lot of if/then/else statements to match against particular subsequences: this could be better written as given statements (perl5.10's equivalent of C's switch). Read about it at perldoc perlsyn:
given ($matrix->[$row][$column])
{
when ($seq1_gap) { $path_matrix->[$row][$column] = -1; }
when ($match_mismatch) { $path_matrix->[$row][$column] = 0; }
when ($seq2_gap) { $path_matrix->[$row][$column] = 1; }
}

The majority of your code is manipulating 2D arrays. I think the biggest improvement would be switching to using PDL if you want to do much stuff with arrays, particularly if efficiency is a concern. It's a Perl module which provides excellent array support. The underlying routines are implemented in C for efficiency so it's fast too.

I would always advise to look at CPAN for previous solutions or examples of how to do things in Perl. Have you looked at Algorithm::NeedlemanWunsch?
The documentation to this module includes an example for matching DNA sequences. Here is an example using the similarity matrix from wikipedia.
#!/usr/bin/perl -w
use strict;
use warnings;
use Inline::Files; #multiple virtual files inside code
use Algorithm::NeedlemanWunsch; # refer CPAN - good style guide
# Read DNA sequences
my #a = read_DNA_seq("DNA_SEQ_A");
my #b = read_DNA_seq("DNA_SEQ_B");
# Read Similarity Matrix (held as a Hash of Hashes)
my %SM = read_Sim_Matrix();
# Define scoring based on "Similarity Matrix" %SM
sub score_sub {
if ( !#_ ) {
return -3; # gap penalty same as wikipedia)
}
return $SM{ $_[0] }{ $_[1] }; # Similarity Value matrix
}
my $matcher = Algorithm::NeedlemanWunsch->new( \&score_sub, -3 );
my $score = $matcher->align( \#a, \#b, { align => \&check_align, } );
print "\nThe maximum score is $score\n";
sub check_align {
my ( $i, $j ) = #_; # #a[i], #b[j]
print "seqA pos: $i, seqB pos: $j\t base \'$a[$i]\'\n";
}
sub read_DNA_seq {
my $source = shift;
my #data;
while (<$source>) {
push #data, /[ACGT-]{1}/g;
}
return #data;
}
sub read_Sim_Matrix {
#Read DNA similarity matrix (scores per Wikipedia)
my ( #AoA, %HoH );
while (<SIMILARITY_MATRIX>) {
push #AoA, [/(\S+)+/g];
}
for ( my $row = 1 ; $row < 5 ; $row++ ) {
for ( my $col = 1 ; $col < 5 ; $col++ ) {
$HoH{ $AoA[0][$col] }{ $AoA[$row][0] } = $AoA[$row][$col];
}
}
return %HoH;
}
__DNA_SEQ_A__
A T G T A G T G T A T A G T
A C A T G C A
__DNA_SEQ_B__
A T G T A G T A C A T G C A
__SIMILARITY_MATRIX__
- A G C T
A 10 -1 -3 -4
G -1 7 -5 -3
C -3 -5 9 0
T -4 -3 0 8
And here is some sample output:
seqA pos: 7, seqB pos: 2 base 'G'
seqA pos: 6, seqB pos: 1 base 'T'
seqA pos: 4, seqB pos: 0 base 'A'
The maximum score is 100

Related

Best data structure for a sparse ordered 2D array of floats allowing interpolation (perl)

The data are stock options. I want to make a 2D array based on days till expiration (int) & normalized distance out of the money (float), with the values being a list of normalized bid and ask prices. If the desired element is not in the array, I want to be able to interpolate between the nearest elements present.
I see 3 possible data structures:
A sparse 2D array, maybe 10000 elements, maybe 1/3 full.
A 2D linked list, ie: 4 listpointers for each data element (so 3000 elements becomes 15000)
A 2D hash (maybe 3000 elements), with 2 sorted lists of the keys (maybe 100 elements each) in each dimension.
The main problem is efficient retrieval when interpolation is required.
Retrieval of existing elements is relatively straight-forward with any method.
I'm currently using choice 3, but retrieval is a bit of a kloodge, since I have to scan along the keylists of each dimension till I find occupied elements, and then do a 2- or 4-way interpolation.
I use moreUtils::firstindx($_ > $desiredKey) to find the keys. The linked lists (choice 2) would spare me the search of the keylist arrays.
Choice 1 would also require scanning, wouldn't need the initial step of keylist lookup, but might need to look at more empty cells. And insertion would be a real hassle.
I would be doing many more searches than insertions.
Does any one have any suggestions for the most efficient data structure.
Since you predominantly perform lookups by lifespan and lookups by distance, and few inserts, I'd use sorted arrays to lookup the records by binary search.
Locating an existing element: O(log N)
Locating the box of a missing element: O(log N)
Inserting: O(N)
Given,
my #data = (
[ $lifespan0, $distance0, $bid0, $ask0 ],
[ $lifespan1, $distance1, $bid1, $ask1 ],
...
);
my $lifespan_search_cmp = sub { $a <=> $data[$b][0] };
my $distance_search_cmp = sub { $a <=> $data[$b][1] };
First, create indexes:
my #by_lifespan = sort { $data[$a][0] <=> $data[$b][0] } 0..$#data;
my #by_distance = sort { $data[$a][1] <=> $data[$b][1] } 0..$#data;
To lookup:
my $i = binsearch_first \&$lifespan_search_cmp, $lifespan, #by_lifespan;
my $j = binsearch_first \&$distance_search_cmp, $distance, #by_distance;
my #lifespan_matching_idxs = get_run_forward \&$lifespan_search_cmp, $lifespan, $i, #by_lifespan;
my #distance_matching_idxs = get_run_forward \&$distance_search_cmp, $distance, $j, #by_distance;
my #cross_match_idxs = do {
my %lifespan_matching_idxs = map { $_ => 1 } #lifespan_matching_idxs;
grep { $lifespan_matching_idxs{$_} }
#distance_matching_idxs
};
if (#cross_match_idxs) {
# Exact match(es) found.
...
} else {
my $lifespan_lowerbracket;
my $lifespan_upperbracket;
if ($i >= 0) {
$lifespan_lowerbracket = $lifespan;
$lifespan_upperbracket = $lifespan;
} else {
die "Can't interpolate" if ~$i == 0 || ~$i >= #by_lifespan;
$lifespan_lowerbracket = $data[~$i ][0];
$lifespan_lowerbracket = $data[~$i - 1][0];
}
my $distance_lowerbracket;
my $distance_upperbracket;
if ($i >= 0) {
$distance_lowerbracket = $distance;
$distance_upperbracket = $distance;
} else {
die "Can't interpolate" if ~$j == 0 || ~$j >= #by_distance;
$distance_lowerbracket = $data[~$j ][1];
$distance_upperbracket = $data[~$j - 1][1];
}
...
}
To insert:
my $i = binsearch_first \&$lifespan_search_cmp, $lifespan, #by_lifespan;
my $j = binsearch_first \&$distance_search_cmp, $distance, #by_distance;
push #data, [ $lifespan, $distance , $bid, $ask ];
splice(#by_lifespan, $i >= 0 ? $i : ~$i, 0, $#data);
splice(#by_distance, $j >= 0 ? $j : ~$j, 0, $#data);
Subs:
sub binsearch_first(&$\#) {
my $compare = $_[0];
#my $value = $_[1];
my $array = $_[2];
my $min = 0;
my $max = $#$array;
return -1 if $max == -1;
my $ap = do { no strict 'refs'; \*{caller().'::a'} }; local *$ap;
my $bp = do { no strict 'refs'; \*{caller().'::b'} }; local *$bp;
*$ap = \($_[1]);
while ($min <= $max) {
my $mid = int(($min+$max)/2);
*$bp = \($array->[$mid]);
my $cmp = $compare->();
if ($cmp < 0) {
$max = $mid - 1;
}
elsif ($cmp > 0) {
$min = $mid + 1;
}
else {
return $mid if $mid == $min;
$max = $mid;
}
}
# Converts unsigned int to signed int.
return unpack('j', pack('J', ~$min));
}
sub get_run_forward(&$\#) {
my $compare = $_[0];
#my $value = $_[1];
my $start = $_[2];
my $array = $_[3];
return if $start < 0;
my $ap = do { no strict 'refs'; \*{caller().'::a'} }; local *$ap;
my $bp = do { no strict 'refs'; \*{caller().'::b'} }; local *$bp;
*$ap = \($_[1]);
my $i = $start;
while ($i <= $#$array) {
*$bp = \($array->[$i]);
my $cmp = $compare->()
and last;
++$i;
}
return wantarray ? ($start..$i-1) : $i-1;
}
You might want to use a tolerance in the floating-point comparions (i.e. in $distance_search_cmp).

Perl: Dice roll "fix" to translate 2-8, 5-8, to 2d4, 1d4+3

I am trying to do some importing of data that uses different dice roll text. For example you'll have 2-4, 2-12, 5-8, 2-5. I am trying to write a translator that will convert them to proper dice rolls. What I have so far will translate it but I end up with some odd rolls. For example on the 5-8 it comes up with 5d1+3. With other things like 2-8 it will give me a reasonable 2d4. I'm trying to figure out how to make it ALWAYS use "real" dice.
Currently I am using this perl script. You simply pass it the dice string you want to fix and it returns what it thinks is the right roll. My problem is I can't think of how to limit the dice "sides" to only these 2,3,4,6,8,12,20 sides.
# turn 2-7 into 1d6+1 or 2-8 into 2d4
sub fix_DiceRolls {
my($diceRoll) = #_;
use POSIX;
if ($diceRoll =~ /^(\d+)(\-)(\d+)\b/i) {
# 2-5
#Group 1. 0-1 `2`
#Group 2. 1-2 `-`
#Group 3. 2-3 `5`
my($count) = $1;
my($size) = $3;
if ($count == 1) {
$diceRoll = "$1d$3";
} else {
my ($newSize) = $size/$count;
my ($remainder) = $size % $count;
my ($round_remainder) = ceil($remainder);
my ($round_newSize) = floor($newSize);
if ($remainder == 0) {
$diceRoll = $count."d".$newSize;
} else {
$diceRoll = $count."d".$round_newSize."+".$round_remainder;
}
}
}
return $diceRoll;
}
The following might help you. It doesn't know how to do 1d6/2, but it correctly translates 4-19 to 3d6+1 | 5d4-1.
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };
my #valid_dice = (4, 6, 8, 10, 12, 20);
sub fix_dice_roll {
my ($dice_roll) = #_;
my ($from, $to) = split /-/, $dice_roll;
my $range = $to - $from + 1;
my #translations;
for my $dice (#valid_dice) {
if (0 == ($range - 1) % ($dice - 1)) {
my $times = ($range - 1) / ($dice - 1);
my $plus = sprintf '%+d', $to - $times * $dice;
$plus = q() if '+0' eq $plus;
push #translations, [ $dice, $times, $plus ];
}
}
#translations = sort { $a->[1] <=> $b->[1] } #translations;
return map "$_->[1]d$_->[0]$_->[2]", #translations;
}
say join ' | ', fix_dice_roll($_) while <>;

For loop help in perl

I am writing perl script and I have little question regarding for loop limit.
Let say I have two arrays, arr1 has serial numbers and arr2 is two dimensional array, the first dimension is the serial number [same as arr1] and the second dimension is the contents of that serial number , Now I want to apply the for loop for this two dimension array but I am confused at the limit . Till now I have this code
Example : I have Three serial numbers , 1 ,2 ,3 . Serial 1 has 2 contents 1,5 . Serial 2 has 1 content i.e 1. Serial 3 has two contents 1,1.
#arr1 = (1,2,3)
$arr2[0][0] = 1
$arr2[0][1] = 5
$arr2[1][0] = 1
$arr2[2][1] = 1
$arr2[2][2] = 1
Note: As you can see the contents of arr2 has arr1 elements in 1st columns and the contents in the second columns.
for (my $i = 0; $i <= $#arr1; $i++) {
print( "The First Serial number has:" );
for (my $j = 0; $j <= $#arr2; $j++) {
print( "$arr2[$i][$j]\n" );
}
}
Thanks, Sorry for the bad explaination
Why don't do this like that :
#!/usr/bin/perl
use strict;
my #arr;
$arr[0][0] = 1;
$arr[0][1] = 5;
$arr[1][0] = 1;
$arr[2][1] = 1;
$arr[2][2] = 1;
my ($i, $j);
foreach $i (#arr) {
foreach $j (#{$i}) {
print $j."\n" if($j);
}
}
1;
__END__
Fixed code:
use strict;
use warnings;
my #arr1 = (1,2,3);
my #arr2;
$arr2[0][0] = 1;
$arr2[0][1] = 5;
$arr2[1][0] = 1;
$arr2[2][0] = 1; # original code had
$arr2[2][1] = 1; # these indexes wrong
for (my $i = 0; $i <= $#arr1; $i++) {
print( "Serial number $arr1[$i] has:" );
for (my $j = 0; $j <= $#{ $arr2[$i] }; $j++) {
print( "$arr2[$i][$j]\n" );
}
}
Note the use of $#{ arrayref }; see http://perlmonks.org/?node=References+quick+reference
you can put #arr2 like this and it would be much easier for you to understand #arr2
use strict;
use warnings;
my #arr1 = (1, 2, 3);
my #arr2 = ([1, 5], [1], [1, 1]);
for my $first(#arr1) {
for my $second (#{$arr2[$first-1]}) {
print $second."\n";
}
}
Here is a version without the first array.
for (my $i = 0; $i<= $#arr; $i++)
{
print "INDEX $i\n";
for (my $j = 0; $j <= $#{$arr[$i]}; $j++)
{
print "${arr[$i][$j]}\n";
}
}
The point here is that a two dimensional array is in fact an array of arrays (well actually array references, but that does not change anything here). So in the inner loop, you should check against the size of the array that is stored in $arr[$i].
Try this.
my #arr2;
$arr2[0][0] = 1;
$arr2[0][1] = 5;
$arr2[1][0] = 1;
$arr2[2][0] = 1;
$arr2[2][1] = 1;
foreach $inside_array (#arr2){
foreach $ele (#$inside_array){
print $ele,"\n";
}
}
Its always better to use foreach instead of for/while, this will eliminate any possibility of bugs. Especially with judging proper condition to exit the loop.

Using perl, given an array of any size, how do I randomly pick 1/4 of the list

For clarification, if I had a list of 8 elements, i would want to randomly pick 2. If I had a list of 20 elements, I would want to randomly pick 5. I would also like to assure (though not needed) that two elements don't touch, i.e. if possible not the 3 and then 4 element. Rather, 3 and 5 would be nicer.
The simplest solution:
Shuffle the list
select the 1st quarter.
Example implementation:
use List::Util qw/shuffle/;
my #nums = 1..20;
my #pick = (shuffle #nums)[0 .. 0.25 * $#nums];
say "#pick";
Example output: 10 2 18 3 19.
Your additional restriction “no neighboring numbers” actually makes this less random, and should be avoided if you want actual randomness. To avoid that two neighboring elements are included in the output, I would iteratively splice unwanted elements out of the list:
my #nums = 1..20;
my $size = 0.25 * #nums;
my #pick;
while (#pick < $size) {
my $i = int rand #nums;
push #pick, my $num = $nums[$i];
# check and remove neighbours
my $len = 1;
$len++ if $i < $#nums and $num + 1 == $nums[$i + 1];
$len++, $i-- if 0 < $i and $num - 1 == $nums[$i - 1];
splice #nums, $i, $len;
}
say "#pick";
use strict;
use warnings;
sub randsel {
my ($fact, $i, #r) = (1.0, 0);
while (#r * 4 < #_) {
if (not grep { $_ == $i } #r) {
$fact = 1.0;
# make $fact = 0.0 if you really don't want
# consecutive elements
$fact = 0.1 if grep { abs($i - $_) == 1 } #r;
push(#r, $i) if (rand() < 0.25 * $fact);
}
$i = ($i + 1) % #_;
}
return map { $_[$_] } sort { $a <=> $b } #r;
}
my #l;
$l[$_] = $_ for (0..19);
print join(" ", randsel(#l)), "\n";

Perl to count current value based on next value

Currently I'm learning Perl and gnuplot. I would like to know how to count certain value based on the next value. For example I have a text file consist of:
#ID(X) Y
1 1
3 9
5 11
The output should show the value of the unknown ID as well. So, the output should show:
#ID(X) Y
1 1
2 5
3 9
4 10
5 11
The Y of ID#2 is based on the following:
((2-3)/(1-3))*1 + ((2-1)/(3-1))*9 which is linear algebra
Y2=((X2-X3)/(X1-X3))*Y1 + ((X2-X1)/(X3-X1)) * Y3
Same goes to ID#5
Currently I have this code,
#! /usr/bin/perl -w
use strict;
my $prev_id = 0;
my $prev_val = 0;
my $next_id;
my $next_val;
while (<>)
{
my ($id, $val) = split;
for (my $i = $prev_id + 1; $i < $next_id; $i++)
{
$val = (($id - $next_id) / ($prev_id - $next_id)) * $prev_val + (($id - $prev_id) / ($next_id - $prev_id)) * $next_val;
printf ("%d %s\n", $i, $val);
}
printf ("%d %s\n", $id, $val);
($prev_val, $prev_id) = ($val, $id);
($next_val, $next_id) = ($prev_val, $prev_id);
}
Your formula seems more complicated than I would expect, given that you are always dealing with integer spacings of 1.
You did not say whether you want to fill gaps for multiple consecutive missing values, but let's assume you want to.
What you do is read in the first line, and say that's the current one and you output it. Now you read the next line, and if its ID is not the expected one, you fill the gaps with simple linear interpolation...
Pseudocode
(currID, currY) = readline()
outputvals( currID, currY )
while lines remain do
(nextID, nextY) = readline()
gap = nextID - currID
for i = 1 to gap
id = currID + i
y = currY + (nextY - currY) * i / gap
outputvals( id, y )
end
(currID, currY) = (nextID, nextY)
end
Sorry for the non-Perl code. It's just that I haven't been using Perl for ages, and can't remember half of the syntax. =) The concepts here are pretty easy to translate into code though.
Using an array may be the way to go. This will also make your data available for further manipulation.
** Caveat: will not work for multiple consecutive missing values of y; see #paddy's answer.
#!/usr/bin/perl
use strict;
use warnings;
my #coordinates;
while (<DATA>) {
my ($x, $y) = split;
$coordinates[$x] = $y;
}
# note that the for loop starts on index 1 here ...
for my $x (1 .. $#coordinates) {
if (! $coordinates[$x]) {
$coordinates[$x] = (($x - ($x + 1)) / (($x - 1) - ($x + 1)))
* $coordinates[$x - 1]
+ (($x - ($x - 1)) / (($x + 1) - ($x - 1)))
* $coordinates[$x + 1];
}
print "$x - $coordinates[$x]\n";
}
__DATA__
1 1
3 9
5 11
You indicated your problem is getting the next value. The key isn't to look ahead, it's to look behind.
my $prev = get first value;
my ($prev_a, $prev_b) = parse($prev);
my $this = get second value;
my ($this_a, $this_b) = parse($this);
while ($next = get next value) {
my ($next_a, $next_b) = parse($next);
...
$prev = $this; $prev_a = $this_a; $prev_b = $this_b;
$this = $next; $this_a = $next_a; $this_b = $next_b;
}
#! /usr/bin/perl -w
use strict;
my #in = (1,9,11);
my #out;
for (my $i = 0; $i<$#in; $i++) {
my $j = $i*2;
my $X1 = $i;
my $X2 = $i+1;
my $X3 = $i+2;
my $Y1 = $in[$i];
my $Y3 = $in[$i+1];
my $Y2 = $Y1*(($X2-$X3)/($X1-$X3))
+ $Y3*(($X2-$X1)/($X3-$X1));
$out[$j] = $in[$i];
$out[$j+1] = $Y2;
}
$out[$#in*2] = $in[$#in];
print (join " ",#out);