How can I modify my program to print out Pascal's triangle? - perl

So first of all Pascal's Triangle looks like this:
The first row that you see is the zeroth row.
That's nothing unusual
when you are a computer scientist.
Each term in Pascal's triangle can be computed as a combination, using the formula:
C(n, k) = n! / [k! * (n - k)!], where "n" is the row and "k" is any integer from zero to n.
It follows that every entry of Pascal's triangle can be written as a C(n, k) combination:
And that's what you are seeing in the figure above.
Pascal's triangle is basically binomial probability:
(H + T)^n # You flip a two sided coin "n" times and it lands on "heads" or "tails" and you collect the frequency of each in a set of coefficients, for n = 3, we get the expansion:
(H + T)^3 = 1(H^3) + 3(H^2)(T) + 3(H)(T^2) + 1(T^3), where those coefficients:
1, 3, 3, 1 are in row 3 of Pascal's triangle.
I defined a factorial (!), and a combination and was able to get the
coefficient numbers on any row of Pascal's triangle with some looping Perl code:
use strict;
use warnings;
# Note the first row is row 0.
print("\nWhich row of Pascal's triangle to display: ");
my $row = <STDIN>; # The row that you want to display # This is also n.
my $terms = $row + 1; # The number of terms is one more than the row number.
Pascal_Row($row); # Print the Pascal numbers for that row.
# Function displays the numbers for a row of Pascal's triangle.
#######################################################
sub Pascal_Row
{
my $row = shift; # Row is passed in.
for(my $k = 0; $k < $row + 1; $k++) # k runs from 0 to n; the row number n stays fixed.
{
print(combination($row, $k), "\t") # Print each term of the row.
}
print("\n"); # Print a newline after each time this function is called.
}
# This computes the factorial of a number.
###########################################
sub factorial
{
my $number = shift; # argument.
my $factorial_number = 1; # initialize the factorial.
for(my $i = 1; $i <= $number; $i++)
{
$factorial_number *= $i; # compute the factorial, by multiplying all terms up to and including number.
}
return $factorial_number; # Return the factorial number.
}
# Computes a mathematical combination, usually denoted as C(n, k),
# where n is the row number and k is the position of an item in that row of Pascal's triangle.
sub combination
{
my($n, $k) = @_; # from input.
# This is the mathematical formula for a combination.
my $combination_number = factorial($n) / (factorial($k) * factorial($n - $k));
return $combination_number # And returning it.
}
If I run the code and ask for row 8 of Pascal's triangle I get this:
Which row of Pascal's triangle to display: 8
1 8 28 56 70 56 28 8 1
That's entirely true for row 8 of Pascal's triangle. If I were to loop this from row 0 to the row 8 of Pascal's triangle I would get all correct rows of Pascal's triangle, but it wouldn't look like a triangle (it would look more like a box), so how could I modify my code to adjust the indenting.
How do I decide how much to indent the first row if I want 8 rows of Pascal's triangle displayed? How can I make a "triangle"?

Left-aligned triangle:
my $MAX_VAL_SIZE = 5;
for my $n (0...$N) {
my @row;
for my $k (0..$n) {
push @row, C($n, $k);
}
say join " ", map sprintf("%*d", $MAX_VAL_SIZE, $_), @row;
}
Centered triangle:
sub center {
my ($n, $s) = @_;
my $pad_len = $n - length($s);
my $pad_len_l = int($pad_len/2);
my $pad_len_r = $pad_len - $pad_len_l;
return ( " " x $pad_len_l ) . $s . ( " " x $pad_len_r );
}
my $MAX_VAL_SIZE = 5;
for my $n (0...$N) {
my @row;
for my $k (0..$n) {
push @row, C($n, $k);
}
my $row = join " ", map center($MAX_VAL_SIZE, $_), @row;
say center(($N+1)*($MAX_VAL_SIZE+2)-2, $row);
}
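The two snippets above call C($n, $k) and use $N without defining either. A minimal sketch of what they might look like, assuming C is the binomial coefficient and $N is the index of the last row to print (both definitions here are assumptions, not part of the answer):

```perl
use strict;
use warnings;

# Hypothetical helper: binomial coefficient C(n, k), computed with the
# multiplicative form so that no large factorials are needed.
# After step i the running value equals C(n, i), which is always an integer.
sub C {
    my ($n, $k) = @_;
    my $c = 1;
    $c = $c * ($n - $_ + 1) / $_ for 1 .. $k;
    return $c;
}

my $N = 8;   # assumed: index of the last row (rows are numbered from 0)

# Print row $N on one line, separated by single spaces.
print join(' ', map { C($N, $_) } 0 .. $N), "\n";
```

With these two definitions in place (plus `use feature 'say';`), the left-aligned and centered loops should run as written.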

This is tricky since the varying width of numbers matters for layout.
Each row needs to be indented by half the separation between numbers in the row, suitably multiplied (zero for the last row, rows-1 for the first) -- that is, if the numbers themselves were all of equal width.
But this isn't the case, except for the first few rows; the numbers take varying amounts of space. One remedy is to use a fixed width for the numbers and adjust indent and separation using that width.
All rows are computed first so that the maximum width of a number can be found.
use warnings;
use strict;
use feature 'say';
use List::Util qw(max);
my $max_row = (shift || 8);
my @rows = map { pascal_row($_) } 0..$max_row-1;
my $max_num_wd = max map { length } @{$rows[-1]};
my $pad = 1; # choice (must be non-zero)
my $sep = ' ' x ($max_num_wd + 2*$pad);
my $lead_sp = ' ' x ($max_num_wd + $pad);
for my $n (0..$#rows) {
say $lead_sp x ($max_row-1-$n),
join $sep, map { sprintf "%${max_num_wd}d", $_ } @{$rows[$n]};
}
sub pascal_row {
my ($row) = @_;
return [ map { n_over_k($row, $_) } 0..$row ];
}
sub n_over_k {
my ($n, $k) = @_;
return factorial($n) / (factorial($k) * factorial($n - $k));
}
sub factorial {
my ($n) = @_;
my $fact = 1;
$fact *= $_ for 2..$n;
return $fact;
}
This prints the correct layout. The $pad is an arbitrary integer for extra space over the maximum number width, for indent and separation; it must be >0 to coordinate them. (Separation needs that space both left and right of the centered number in the row above, thus the factor of 2.)
Original code, printing as it computes so $max_num_wd is set ahead by hand
# (includes and subs same as above except for List::Util)
my $max_row = (shift || 8);
my $max_num_wd = 4; # maximum width of numbers
my $pad = 1; # choice (non-zero)
my $sep = ' ' x ($max_num_wd + 2*$pad);
my $lead_sp = ' ' x ($max_num_wd + $pad);
for my $n (0..$max_row-1) {
my @row = @{ pascal_row($n) };
say $lead_sp x ($max_row-1-$n),
join $sep, map { sprintf "%${max_num_wd}d", $_ } @row;
}
This prints a correct layout for numbers up to 4 digits wide; for wider numbers, $max_num_wd needs to be adjusted.

Here is another way of doing it:
use strict;
use warnings;
sub fact {
my $n = shift;
return 1 if $n < 1;
return $n * fact($n - 1);
}
sub n_over_k {
my $n = shift;
my $k = shift;
return fact($n) / ( fact($k) * fact($n - $k) );
}
sub pascal_row {
my $n = shift;
return map { n_over_k($n - 1, $_) } (0 .. $n - 1);
}
my $n = shift || 8;
# $maxw is the first odd width where the biggest number will fit
my $max = 0;
map { $max = $_ if $_ > $max } pascal_row($n);
my $maxw = length('' . $max);
$maxw += ($maxw + 1) % 2;
# Print Pascal's triangle
foreach my $i (1..$n) {
print ' ' x ( ( $maxw + 1 ) * ($n - $i) / 2 );
foreach my $j ( pascal_row($i) ) {
printf "%${maxw}d ", $j;
}
print "\n";
}
How is it done? Fit each number within the first odd width where the largest of the numbers to print will fit. That's because numbers are separated with a space, which makes each cell width even (and so divisible by two for odd-valued triangle rows). Then use printf to format the numbers. For instance, %5d will right-align the number within 5 characters. Precede each line except the last with the necessary spaces using ' ' x N, which generates a string of N spaces.
Pascal's triangle of 8:
# pascal.pl 8
                1
              1   1
            1   2   1
          1   3   3   1
        1   4   6   4   1
      1   5  10  10   5   1
    1   6  15  20  15   6   1
  1   7  21  35  35  21   7   1
Pascal's triangle of 13:
# pascal.pl 13
                          1
                        1   1
                      1   2   1
                    1   3   3   1
                  1   4   6   4   1
                1   5  10  10   5   1
              1   6  15  20  15   6   1
            1   7  21  35  35  21   7   1
          1   8  28  56  70  56  28   8   1
        1   9  36  84 126 126  84  36   9   1
      1  10  45 120 210 252 210 120  45  10   1
    1  11  55 165 330 462 462 330 165  55  11   1
  1  12  66 220 495 792 924 792 495 220  66  12   1

You can generate the triangle without any combinatoric formulas.
The reason to do it this way is that it is the most efficient method.
The basic idea is to use the observation that each value in the next
row is the sum of the 2 elements located above it.
This solution is also a good example of how to work with an array of
(references to) arrays.
An interesting feature is that the indent is computed from the
middle element in the last row (the one with the greatest value).
For the triangle to look right, the cell size must be an even number.
The "basic" indent is half of this size.
The actual indent for each row is this basic size multiplied by a
factor derived from the row index and the total number of rows.
The whole script is given below:
use strict;
use warnings;
use feature qw(say);
use POSIX qw(ceil);
my $rowCnt = 14; # How many rows
say "Pascal Triangle with $rowCnt rows:";
# Rows container, filled with a single row (containing single 1)
my @rows = ([ 1 ]);
my ($lastRow, $row, $ind);
# Generate / add further rows
for ($ind = 1; $ind < $rowCnt; $ind++) {
$lastRow = $rows[$#rows]; # Last row gathered so far
push(@rows, getNextRow($lastRow));
}
$lastRow = $rows[$#rows];
# Middle elem. of the last row
my $midElem = $$lastRow[($rowCnt - 1) / 2];
# No of digits + separator, rounded up to even
my $elemSize = ceil((length($midElem) + 1) / 2) * 2;
my $shf = $elemSize / 2; # Shift size for a single step
# Print rows
for ($ind = 0; $ind < $rowCnt; $ind++) {
my $row = $rows[$ind];
my $spc = $shf * ($rowCnt - $ind - 1);
printRow($spc, $row, $elemSize);
}
sub getNextRow { # Create the next row and return the reference to it
my $lastRow = $_[0]; # Read param
my @row = (1); # Start the new row from a single 1
for (my $i = 0; $i < $#$lastRow; $i++) {
push(@row, $$lastRow[$i] + $$lastRow[$i + 1]);
}
push(@row, 1); # Add terminating 1
return \@row; # Result - reference to the created row
}
sub printRow { # Print a row of the triangle
my ($leadSpc, $row, $elemSize) = @_; # Read params
# Leading spaces and the initial element (always 1)
printf("%s1", ' ' x $leadSpc);
# Print the rest of the row
for (my $i = 1; $i <= $#$row; $i++) {
printf("%*d", $elemSize, $$row[$i]);
}
print("\n");
}

Related

Using perl, given an array of any size, how do I randomly pick 1/4 of the list

For clarification, if I had a list of 8 elements, i would want to randomly pick 2. If I had a list of 20 elements, I would want to randomly pick 5. I would also like to assure (though not needed) that two elements don't touch, i.e. if possible not the 3 and then 4 element. Rather, 3 and 5 would be nicer.
The simplest solution:
Shuffle the list
select the 1st quarter.
Example implementation:
use List::Util qw/shuffle/;
my @nums = 1..20;
my @pick = (shuffle @nums)[0 .. 0.25 * $#nums];
say "@pick";
Example output: 10 2 18 3 19.
Your additional restriction “no neighboring numbers” actually makes this less random, and should be avoided if you want actual randomness. To avoid that two neighboring elements are included in the output, I would iteratively splice unwanted elements out of the list:
my @nums = 1..20;
my $size = 0.25 * @nums;
my @pick;
while (@pick < $size) {
my $i = int rand @nums;
push @pick, my $num = $nums[$i];
# check and remove neighbours
my $len = 1;
$len++ if $i < $#nums and $num + 1 == $nums[$i + 1];
$len++, $i-- if 0 < $i and $num - 1 == $nums[$i - 1];
splice @nums, $i, $len;
}
say "@pick";
use strict;
use warnings;
sub randsel {
my ($fact, $i, @r) = (1.0, 0);
while (@r * 4 < @_) {
if (not grep { $_ == $i } @r) {
$fact = 1.0;
# make $fact = 0.0 if you really don't want
# consecutive elements
$fact = 0.1 if grep { abs($i - $_) == 1 } @r;
push(@r, $i) if (rand() < 0.25 * $fact);
}
$i = ($i + 1) % @_;
}
return map { $_[$_] } sort { $a <=> $b } @r;
}
my @l;
$l[$_] = $_ for (0..19);
print join(" ", randsel(@l)), "\n";

Print Armstrong numbers between 1 to 10 million

How to write a logic using for loop or while loop for printing Armstrong numbers?
Someone kindly explain how to print Armstrong numbers between 1 to 1,00,00,000.
This the algorithm that I followed
step 1 : initializing variable min,max,n,sum,r,t
step 2 : my $n = <>;
step 3 : to find base of $n
step 4 : using for loop
( for (n = min; n < max ; n++ )
step 5 : some logic like
n=t,sum =0,r=t%10,t=n/10,
step 6 :
sum = sum + (n ^ base );
step 6 :
if ( sum == num ) print Armstrong numbers else not.
I tried to code this my code look like this
#!/usr/bin/perl
use strict;
use warnings;
use diagnostics;
my $n;
chomp($n);
my $min = 1;
my $max = 10000000
my $r;
my $sum;
my $t;
my $base = length($n);
print "base is $base\n";
for ($n = $min; $n <= $max; $n++) {
$t = $n;
$sum = 0;
while ($t != 0) {
$r = $t % 10;
$t = $t / 10;
{
$sum = $sum + ($base * $r);
}
if ($sum == $n) {
print "$n\n";
}
}
}
Several things:
It's bad practice to declare something with my before you need it.
You must remember that numbers are also strings, and can be manipulated by string functions like split.
C-like loops are discouraged in Perl because they're hard to read.
Constants should be ...well... constant.
Here's my attempt. I use split to split up my digits into an array of digits. This is a lot easier than dividing constantly by ten. I can get the number of digits by simply taking the scalar value of my @digits array.
I can then loop through @digits, taking each one to the power of $power and adding it to sum. I use the map command for this loop, but I could have used another for loop too.
#! /usr/bin/env perl
#
use strict;
use warnings;
use feature qw(say);
use constant {
MIN => 1,
MAX => 1_000_000,
};
for my $number ( (+MIN..+MAX) ) {
my @digits = split //, $number;
my $power = @digits;
my $sum = 0;
map { $sum += $_**$power } @digits;
if ( $sum == $number ) {
say "$number is an Armstrong number";
}
}
And my output:
1 is an Armstrong number
2 is an Armstrong number
3 is an Armstrong number
4 is an Armstrong number
5 is an Armstrong number
6 is an Armstrong number
7 is an Armstrong number
8 is an Armstrong number
9 is an Armstrong number
153 is an Armstrong number
370 is an Armstrong number
371 is an Armstrong number
407 is an Armstrong number
1634 is an Armstrong number
8208 is an Armstrong number
9474 is an Armstrong number
54748 is an Armstrong number
92727 is an Armstrong number
93084 is an Armstrong number
548834 is an Armstrong number
Took a bit over five seconds to run.
Instead of map, I could have done this loop:
for my $digit ( @digits ) {
$sum = $sum + ( $digit ** $power);
}
Did this one at university...
I dug out the one I made in C and converted it to perl for you (it may not be the best way to do this, but it is the way I did it):
#!/usr/bin/env perl
use strict;
use warnings;
my $min = 1;
my $max = 10000000;
for (my $number = $min; $number <= $max; $number++) {
my @digits = split('', $number);
my $sum = 0;
foreach my $digit (@digits) {
$sum += $digit**length($number);
}
if ($sum == $number) {
print "$number\n";
}
}
(Demo - 1 to 9999 due to execution time limit)
Your code seems to be mostly right, but you have some problems at the start. For example, you don't read from STDIN or from @ARGV. Once you do that, you just have a small problem with your exponentiation. In most programming languages, the syntax for exponentiation is ** or a pow() function.
I really don't understand what this part is for:
while ($t != 0) {
$r = $t % 10;
$t = $t / 10;
{
$sum = $sum + ($base * $r);
}
if ($sum == $n) {
print "$n\n";
}
}
What is the naked block for? Why do you use the modulus? Well, here is a small piece of code for calculating the Armstrong numbers with bases 0..100, between 0 and 10 million:
#!/usr/bin/perl
use strict;
use warnings;
foreach my $base (0..100) { # use the foreach loop as base
for my $num (0..10_000_000) { # and use numbers between this range
my $ce=0; # ce = calculated exp.
foreach my $num2 (split //,$num ) { # split each number for calc each
$ce += $num2 ** $base; # exp. and adding it (see algorithm)
}
if ($num == $ce) { # when the exp. num and the number
print "$base => $num\n"; # itself equals, its a armstrong number
} # print that
}
}
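For comparison, the modulus-and-division loop from the question can be made to work: `% 10` peels off the last decimal digit, and the division drops it. The sketch below assumes the usual definition (each digit raised to the number of digits) and uses a hypothetical helper name, is_armstrong. The two changes from the question's loop are `**` instead of `*`, and `int()` around the division so that $t stays an integer.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $n equals the sum of its decimal digits,
# each raised to the number of digits in $n.
sub is_armstrong {
    my ($n) = @_;
    my $power = length $n;      # number of decimal digits
    my $sum   = 0;
    my $t     = $n;
    while ($t != 0) {
        my $r = $t % 10;        # last digit
        $sum += $r ** $power;   # ** is exponentiation; * would only multiply
        $t = int($t / 10);      # drop the last digit; plain / keeps the fraction
    }
    return $sum == $n;
}

# Small range for a quick check; the question asks for up to 10_000_000.
print join(' ', grep { is_armstrong($_) } 1 .. 10_000), "\n";
```

The comparison is done once, after the digit loop has finished, rather than inside it as in the question's code.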

Trying to figure out if there's a shorter/better way to implement a conditional-sum-like function

Warning: Project Euler Problem 1 Spoiler
I recently discovered Project Euler and decided to try a few of the problems. The first problem was to sum the numbers from 0-999 that are multiples of 3 or 5.
My first, "java-like" solution was:
print threeAndFive(1000)."\n";
# Returns the sum of the numbers less than $max that are multiples of 3 or 5
sub threeAndFive
{
my $max = shift;
my $sum = 0;
for (my $i=0; $i < $max; $i++)
{
$sum+=$i if (validate($i));
}
return $sum;
}
sub validate
{
my $num = shift;
if ($num % 3 == 0 || $num % 5 == 0)
{
return 1;
}
return undef;
}
I then rewrote it in a more perlish fashion:
print eval(join ('+', map {($_ % 3 == 0 || $_ % 5 == 0) ? $_ : ()} (1 .. 999)));
While this is obviously way more concise than the original code, I feel that it can probably be shorter or done in a better fashion. For example, in Python, one can do:
print sum([i for i in range(1,1000) if i%3==0 or i%5==0])
Are there more concise/better/clearer ways to do this? Or other equivalent ways that use different functions? I'm interested in learning as much perl as I can, so the more solutions, the merrier.
Thanks in advance.
The Straightforward Approach
To answer your question, List::Util provides sum.
use List::Util qw( sum );
Or you could write your own
sub sum { my $acc; $acc += $_ for @_; $acc }
Then you get:
say sum grep { $_ % 3 == 0 || $_ % 5 == 0 } 0..999;
Of course, that's an unoptimised approach.
The Optimised Approach
You can easily reduce the above to O(1) memory from O(N) by using a counting loop.
my $acc;
for (1..999) { $acc += $_ if $_ % 3 == 0 || $_ % 5 == 0; }
say $acc;
But that's far from the best, since the result can be obtained in O(1) time and memory!
This is done by adding the sum of the multiples of 3 to the sum of the multiples of 5, then subtracting the sum of the multiples of 15, because the sums of the multiples of $x can be calculated using
( sum 1..floor($n/$x) ) * $x # e.g. 3+6+9+... = (1+2+3+...)*3
which can take advantage of the formula
sum 1..$n = $n * ($n+1) * 0.5
Less concise, but faster:
sub sum1toN { my $N = int(shift); ($N * ($N+1)) / 2; }
my $N = 999;
print sum1toN($N/3)*3 + sum1toN($N/5)*5 - sum1toN($N/15)*15, "\n";
The sum1toN function computes the sum of integers from 1 to N.
Since:
3 + 6 + 9 + 12 ... + 999
Equals:
(1 + 2 + 3 + ... 333 ) * 3
We can compute the sum of the multiples of 3 using sum1toN(N/3) * 3, and the same applies to 5. Note that since the multiples of 15 are counted in both cases, a subtraction of sum1toN(N/15)*15 is needed.
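As a sanity check on the inclusion-exclusion argument, the closed form can be compared against a plain loop. This is only a sketch; the helper names sum_multiples, closed_form and brute_force are made up for the illustration.

```perl
use strict;
use warnings;

# Sum of the multiples of $x that are <= $n: x * (1 + 2 + ... + floor(n/x)).
sub sum_multiples {
    my ($x, $n) = @_;
    my $m = int($n / $x);
    return $x * $m * ($m + 1) / 2;
}

# Inclusion-exclusion: multiples of 15 are counted under both 3 and 5,
# so they are subtracted once.
sub closed_form {
    my ($n) = @_;
    return sum_multiples(3, $n) + sum_multiples(5, $n) - sum_multiples(15, $n);
}

# Brute-force reference for comparison.
sub brute_force {
    my ($n) = @_;
    my $acc = 0;
    for (1 .. $n) { $acc += $_ if $_ % 3 == 0 || $_ % 5 == 0 }
    return $acc;
}

print closed_form(999), " ", brute_force(999), "\n";   # both print 233168
```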

Perl to count current value based on next value

Currently I'm learning Perl and gnuplot. I would like to know how to compute a value based on the next value. For example, I have a text file consisting of:
#ID(X) Y
1 1
3 9
5 11
The output should show the value of the unknown ID as well. So, the output should show:
#ID(X) Y
1 1
2 5
3 9
4 10
5 11
The Y of ID#2 is based on the following:
((2-3)/(1-3))*1 + ((2-1)/(3-1))*9, which is linear interpolation:
Y2=((X2-X3)/(X1-X3))*Y1 + ((X2-X1)/(X3-X1)) * Y3
The same goes for ID#4.
Currently I have this code,
#! /usr/bin/perl -w
use strict;
my $prev_id = 0;
my $prev_val = 0;
my $next_id;
my $next_val;
while (<>)
{
my ($id, $val) = split;
for (my $i = $prev_id + 1; $i < $next_id; $i++)
{
$val = (($id - $next_id) / ($prev_id - $next_id)) * $prev_val + (($id - $prev_id) / ($next_id - $prev_id)) * $next_val;
printf ("%d %s\n", $i, $val);
}
printf ("%d %s\n", $id, $val);
($prev_val, $prev_id) = ($val, $id);
($next_val, $next_id) = ($prev_val, $prev_id);
}
Your formula seems more complicated than I would expect, given that you are always dealing with integer spacings of 1.
You did not say whether you want to fill gaps for multiple consecutive missing values, but let's assume you want to.
What you do is read in the first line, and say that's the current one and you output it. Now you read the next line, and if its ID is not the expected one, you fill the gaps with simple linear interpolation...
Pseudocode
(currID, currY) = readline()
outputvals( currID, currY )
while lines remain do
(nextID, nextY) = readline()
gap = nextID - currID
for i = 1 to gap
id = currID + i
y = currY + (nextY - currY) * i / gap
outputvals( id, y )
end
(currID, currY) = (nextID, nextY)
end
Sorry for the non-Perl code. It's just that I haven't been using Perl for ages, and can't remember half of the syntax. =) The concepts here are pretty easy to translate into code though.
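A possible Perl translation of the pseudocode above, as a sketch only. It assumes the points arrive sorted by ID with integer IDs; the function name fill_gaps is made up, and the sample data is hard-coded rather than read from a file.

```perl
use strict;
use warnings;

# Takes a list of [id, y] pairs sorted by id and returns the list with
# every missing integer id filled in by linear interpolation.
sub fill_gaps {
    my @points = @_;
    return () unless @points;
    my ($curr_id, $curr_y) = @{ shift @points };
    my @out = ([ $curr_id, $curr_y ]);
    for my $p (@points) {
        my ($next_id, $next_y) = @$p;
        my $gap = $next_id - $curr_id;
        # i == gap reproduces the next known point itself.
        for my $i (1 .. $gap) {
            push @out, [ $curr_id + $i, $curr_y + ($next_y - $curr_y) * $i / $gap ];
        }
        ($curr_id, $curr_y) = ($next_id, $next_y);
    }
    return @out;
}

# Sample data from the question.
my @filled = fill_gaps([1, 1], [3, 9], [5, 11]);
print "#ID(X) Y\n";
print "$_->[0] $_->[1]\n" for @filled;
```

Because the inner loop runs over the whole gap, this also handles several consecutive missing IDs.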
Using an array may be the way to go. This will also make your data available for further manipulation.
** Caveat: will not work for multiple consecutive missing values of y; see @paddy's answer.
#!/usr/bin/perl
use strict;
use warnings;
my @coordinates;
while (<DATA>) {
my ($x, $y) = split;
$coordinates[$x] = $y;
}
# note that the for loop starts on index 1 here ...
for my $x (1 .. $#coordinates) {
if (! $coordinates[$x]) {
$coordinates[$x] = (($x - ($x + 1)) / (($x - 1) - ($x + 1)))
* $coordinates[$x - 1]
+ (($x - ($x - 1)) / (($x + 1) - ($x - 1)))
* $coordinates[$x + 1];
}
print "$x - $coordinates[$x]\n";
}
__DATA__
1 1
3 9
5 11
You indicated your problem is getting the next value. The key isn't to look ahead, it's to look behind.
my $prev = get first value;
my ($prev_a, $prev_b) = parse($prev);
my $this = get second value;
my ($this_a, $this_b) = parse($this);
while ($next = get next value) {
my ($next_a, $next_b) = parse($next);
...
$prev = $this; $prev_a = $this_a; $prev_b = $this_b;
$this = $next; $this_a = $next_a; $this_b = $next_b;
}
#! /usr/bin/perl -w
use strict;
my @in = (1,9,11);
my @out;
for (my $i = 0; $i<$#in; $i++) {
my $j = $i*2;
my $X1 = $i;
my $X2 = $i+1;
my $X3 = $i+2;
my $Y1 = $in[$i];
my $Y3 = $in[$i+1];
my $Y2 = $Y1*(($X2-$X3)/($X1-$X3))
+ $Y3*(($X2-$X1)/($X3-$X1));
$out[$j] = $in[$i];
$out[$j+1] = $Y2;
}
$out[$#in*2] = $in[$#in];
print (join " ",@out);

Capturing Non-Zero Elements, Counts and Indexes of Sparse Matrix

I have the following sparse matrix A.
2 3 0 0 0
3 0 4 0 6
0 -1 -3 2 0
0 0 1 0 0
0 4 2 0 1
Then I would like to capture the following information from there:
cumulative count of entries, as matrix is scanned columnwise.
Yielding:
Ap = [ 0, 2, 5, 9, 10, 12 ];
row indices of entries, as matrix is scanned columnwise.
Yielding:
Ai = [0, 1, 0, 2, 4, 1, 2, 3, 4, 2, 1, 4 ];
Non-zero matrix entries, as matrix is scanned columnwise.
Yielding:
Ax = [2, 3, 3, -1, 4, 4, -3, 1, 2, 2, 6, 1];
Since the actual matrix A is potentially very, very large, is there any efficient way
in Perl that can capture those elements? Especially without slurping all matrix A
into RAM.
I am stuck with the following code. Which doesn't give what I want.
use strict;
use warnings;
my (@Ax, @Ai, @Ap) = ();
while (<>) {
chomp;
my @elements = split /\s+/;
my $i = 0;
my $new_line = 1;
while (defined(my $element = shift @elements)) {
$i++;
if ($element) {
push @Ax, 0 + $element;
if ($new_line) {
push @Ai, scalar @Ax;
$new_line = 0;
}
push @Ap, $i;
}
}
}
push @Ai, 1 + @Ax;
print('@Ax = [', join(" ", @Ax), "]\n");
print('@Ai = [', join(" ", @Ai), "]\n");
print('@Ap = [', join(" ", @Ap), "]\n");
A common strategy for storing sparse data is to drop the values you don't care about (the zeroes) and to store the row and column indexes with each value that you do care about, thus preserving their positional information:
[VALUE, ROW, COLUMN]
In your case, you can economize further since all of your needs can be met by processing the data column-by-column, which means we don't have to repeat COLUMN for every value.
use strict;
use warnings;
use Data::Dumper;
my ($r, $c, @dataC, @Ap, @Ai, @Ax, $cumul);
# Read data row by row, storing non-zero values by column.
# $dataC[COLUMN] = [
# [VALUE, ROW],
# [VALUE, ROW],
# etc.
# ]
$r = -1;
while (<DATA>) {
chomp;
$r ++;
$c = -1;
for my $v ( split '\s+', $_ ){
$c ++;
push @{$dataC[$c]}, [$v, $r] if $v;
}
}
# Iterate through the data column by column
# to compute the three result arrays.
$cumul = 0;
@Ap = ($cumul);
$c = -1;
for my $column (@dataC){
$c ++;
$cumul += @$column;
push @Ap, $cumul;
for my $value (@$column){
push @Ax, $value->[0];
push @Ai, $value->[1];
}
}
__DATA__
2 3 0 0 0
3 0 4 0 6
0 -1 -3 2 0
0 0 1 0 0
0 4 2 0 1
This is what you are looking for, I guess:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper::Simple;
my @matrix;
# Populate @matrix
while (<>) {
push @matrix, [ split /\s+/ ];
}
my $columns = @{ $matrix[0] };
my $rows = @matrix;
my ( @Ap, @Ai, @Ax );
my $ap = 0;
push @Ap, $ap; # Ap starts with 0
for ( my $j = 0 ; $j < $columns ; $j++ ) { # scan column by column
for ( my $i = 0 ; $i < $rows ; $i++ ) {
if ( $matrix[$i]->[$j] ) {
$ap++;
push @Ai, $i;
push @Ax, $matrix[$i]->[$j];
}
}
push @Ap, $ap;
}
print Dumper @Ap;
print Dumper @Ai;
print Dumper @Ax;
Updated based on FM's comment. If you do not want to store any of the original data:
#!/usr/bin/perl
use strict;
use warnings;
my %matrix_info;
while ( <DATA> ) {
chomp;
last unless /[0-9]/;
my @v = map {0 + $_ } split;
for (my $i = 0; $i < @v; ++$i) {
if ( $v[$i] ) {
push @{ $matrix_info{$i}->{indices} }, $. - 1;
push @{ $matrix_info{$i}->{nonzero} }, $v[$i];
}
}
}
my @cum_count = (0);
my @row_indices;
my @nonzero;
for my $i ( sort {$a <=> $b } keys %matrix_info ) {
my $mi = $matrix_info{$i};
push @nonzero, @{ $mi->{nonzero} };
my @i = @{ $mi->{indices} };
push @cum_count, $cum_count[-1] + @i;
push @row_indices, @i;
}
print(
"\@Ap = [@cum_count]\n",
"\@Ai = [@row_indices]\n",
"\@Ax = [@nonzero]\n",
);
__DATA__
2 3 0 0 0
3 0 4 0 6
0 -1 -3 2 0
0 0 1 0 0
0 4 2 0 1
Output:
C:\Temp> m
@Ap = [0 2 5 9 10 12]
@Ai = [0 1 0 2 4 1 2 3 4 2 1 4]
@Ax = [2 3 3 -1 4 4 -3 1 2 2 6 1]
Ap is easy: simply start with zeroes and increment each time you meet a nonzero number. I don't see you trying to write anything into @Ap, so it's no surprise it doesn't end up as you wish.
Ai and Ax are trickier: you want a columnwise ordering while you're scanning rowwise. You won't be able to do anything in-place since you don't know yet how many elements the columns will yield, so you can't know in advance the elements' position.
Obviously, it would be a hell lot easier if you could just alter the requirement to have a rowwise ordering instead. Failing that, you could get complex and collect (i, j, x) triplets. While collecting, they'd naturally be ordered by (i, j). Post-collection, you'd just want to sort them by (j, i).
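The triplet idea can be sketched as follows. This is an illustration under assumptions, not a tuned implementation: the function name compress_columns is made up, rows are passed in as text lines, and all non-zero triplets are held in memory (the zeroes are not).

```perl
use strict;
use warnings;

# Builds Ap, Ai, Ax from matrix rows given as whitespace-separated text lines.
# Only non-zero entries are kept, as [row, col, value] triplets.
sub compress_columns {
    my @lines = @_;
    my @triplets;
    my $ncols = 0;
    my $i = 0;
    for my $line (@lines) {
        my @v = split ' ', $line;
        $ncols = @v if @v > $ncols;
        for my $j (0 .. $#v) {
            push @triplets, [ $i, $j, $v[$j] ] if $v[$j] != 0;
        }
        $i++;
    }
    # Collected in (i, j) order; re-sort by (j, i) for columnwise output.
    @triplets = sort { $a->[1] <=> $b->[1] or $a->[0] <=> $b->[0] } @triplets;

    # Per-column counts, then a running total that starts at 0.
    my @count = (0) x $ncols;
    $count[ $_->[1] ]++ for @triplets;
    my @Ap = (0);
    push @Ap, $Ap[-1] + $_ for @count;

    my @Ai = map { $_->[0] } @triplets;
    my @Ax = map { $_->[2] } @triplets;
    return (\@Ap, \@Ai, \@Ax);
}

my ($Ap, $Ai, $Ax) = compress_columns(
    '2 3 0 0 0', '3 0 4 0 6', '0 -1 -3 2 0', '0 0 1 0 0', '0 4 2 0 1',
);
print "Ap = [@$Ap]\nAi = [@$Ai]\nAx = [@$Ax]\n";
```

Testing with `!= 0` rather than `> 0` keeps the negative entries (-1 and -3) of the sample matrix.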
The code you provided works on a row-by-row basis. To get results sequential by columns you have to accumulate your values into separate arrays, one for each column:
# will look like ([], [], [] ...), one [] for each column.
my @columns;
while (<MATRIX>) {
my @row = split qr'\s+';
for (my $col = 0; $col < @row; $col++) {
# push each non-zero value (negative ones included) into its column
push @{$columns[$col]}, $row[$col] if $row[$col] != 0;
}
}
# now you only need to flatten it to get the desired kind of output:
use List::Flatten;
my @non_zero = flat @columns;
See also List::Flatten.