List::Util - reduce - length - encoding - question - perl

Why do I get a wrong result with the first reduce example?
test.txt
__BE
bb bbbbbbbbbbbbbbb
aaaaaa
test.pl
#!/usr/bin/env perl
use warnings; use 5.012;
use open ':encoding(UTF-8)';
use List::Util qw(reduce);
use Encode;
my( @list, $longest, $len );
open my $fh, '<', 'test.txt' or die $!;
while( my $line = readline( $fh ) ) {
    chomp $line;
    push @list, split( /\s+/, $line );
}
close $fh;
$longest = reduce{ length($a) > length($b) ? $a : $b } @list;
$len = length $longest;
say $longest; # aaaaaa
say $len; # 6
$longest = reduce{ length(Encode::encode_utf8($a)) > length(Encode::encode_utf8($b)) ? $a : $b } @list;
$len = length(Encode::encode_utf8($longest));
say $longest; # bbbbbbbbbbbbbbb
say $len; # 15
$longest = $list[0];
$len = length $longest;
for my $str (@list) {
    if ( length($str) > $len ) {
        $longest = $str;
        $len = length($str);
    }
}
say $longest; # bbbbbbbbbbbbbbb
say $len; # 15

AFAICS, it might even be a bug in Perl...it certainly isn't obvious that it is behaving correctly. I modified the first reduce to print diagnostics as it goes:
#!/usr/bin/env perl
use warnings; use 5.012;
use open ':encoding(UTF-8)';
use List::Util qw(reduce);
use Encode;
my( @list, $longest, $len );
open my $fh, '<', 'test.txt' or die $!;
while( my $line = readline( $fh ) ) {
    chomp $line;
    push @list, split( /\s+/, $line );
}
close $fh;
$longest = reduce { say "<<$a>>/<<$b>> : ", length($a), " : ", length($b);
    length($a) > length($b) ? $a : $b } @list;
$len = length $longest;
say $longest; # aaaaaa
say $len; # 6
$longest = reduce { length(Encode::encode_utf8($a)) > length(Encode::encode_utf8($b)) ? $a : $b } @list;
$len = length(Encode::encode_utf8($longest));
say $longest; # bbbbbbbbbbbbbbb
say $len; # 15
$longest = $list[0];
$len = length $longest;
for my $str (@list) {
    if ( length($str) > $len ) {
        $longest = $str;
        $len = length($str);
    }
}
say $longest; # bbbbbbbbbbbbbbb
say $len; # 15
When run on MacOS X (10.6.5) using Perl 5.13.4, the output I get is:
<<>>/<<__BE>> : 0 : 4
<<__BE>>/<<>> : 0 : 0
<<>>/<<bb>> : 0 : 2
<<bb>>/<<bbbbbbbbbbbbbbb>> : 0 : 15
<<bbbbbbbbbbbbbbb>>/<<>> : 0 : 0
<<>>/<<aaaaaa>> : 0 : 6
aaaaaa
6
bbbbbbbbbbbbbbb
15
bbbbbbbbbbbbbbb
15
To all appearances, length($a) in the first reduce is always 0, even on those odd occasions when $a contains some data.
If the 'use open ':encoding(UTF-8)';' line is removed, then it behaves sanely.
<<>>/<<__BE>> : 0 : 4
<<__BE>>/<<>> : 4 : 0
<<__BE>>/<<bb>> : 4 : 2
<<__BE>>/<<bbbbbbbbbbbbbbb>> : 4 : 15
<<bbbbbbbbbbbbbbb>>/<<>> : 15 : 0
<<bbbbbbbbbbbbbbb>>/<<aaaaaa>> : 15 : 6
bbbbbbbbbbbbbbb
15
bbbbbbbbbbbbbbb
15
bbbbbbbbbbbbbbb
15
That might suggest that the bug is somewhere in the interaction of file I/O, UTF-8 encoding and List::Util. On the other hand, it could be somewhere more obscure. But my impression is that you have a test case that is reproducible and could be reported as a possible bug somewhere in Perl and its core modules.
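While the report is being looked at, one way to sidestep the problem is to avoid reduce's $a and $b altogether. The sketch below is a workaround under assumptions, not a fix: it assumes the misbehaviour is confined to how reduce handles its block variables, the helper name longest_string is made up for illustration, and the list is sample data typed in rather than read from test.txt. It relies only on max and first, which List::Util exports.

```perl
use strict;
use warnings;
use List::Util qw(max first);

# Hypothetical helper: return the longest string in the list
# (the first one, if several share the maximum length).
sub longest_string {
    my @strings = @_;
    return unless @strings;
    my $max_len = max map { length($_) } @strings;
    return first { length($_) == $max_len } @strings;
}

# Sample data modelled on the words in test.txt.
my @list = ( '__BE', 'bb', 'b' x 15, 'aaaaaa' );
my $longest = longest_string(@list);
print "$longest\n";
print length($longest), "\n";
```

Unlike the reduce version, which keeps the last of several equally long strings, this returns the first one.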

I've reported this as a bug in List::Util after trying to modify this program.

Related

Pick up the longest peptide using perl

I want to find out the longest possible protein sequence translated from cds in 6 forward and reverse frame.
This is the example input format:
>111
KKKKKKKMGFSOXLKPXLLLLLLLLLLLLLLLLLMJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJX
>222
WWWMPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPMPPPPPXKKKKKK
I would like to find all the strings which start at "M" and stop at "X", count the length of each string, and select the longest.
For example, in the case above:
the script will find,
>111 has two matches:
MGFSOX
MJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJX
>222 has one match:
MPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPMPPPPPX
Then count each match's length, and print the string and number of longest matches which is the result I want:
>111
MJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJX 32
>222
MPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPMPPPPPX 38
But it prints out no answer. Does anyone know how to fix it? Any suggestion will be helpful.
#!/usr/bin/perl -w
use strict;
use warnings;
my @pep=();
my $i=();
my @Xnum=();
my $n=();
my %hash=();
my @k=();
my $seq=();
$n=0;
open(IN, "<$ARGV[0]");
while(<IN>){
    chomp;
    if($_=~/^[^\>]/){
        @pep=split(//, $_);
        if($_ =~ /(X)/){
            push(@Xnum, $1);
            if($n >= 0 && $n <= $#Xnum){
                if(@pep eq "M"){
                    for($i=1; $i<=$#pep; $i++){
                        $seq=join("",@pep);
                        $hash{$i}=$seq;
                        push(@k, $i);
                    }
                }
                elsif(@pep eq "X"){
                    $n=$n+1;
                }
                foreach (sort {$a cmp $b} @k){
                    print "$hash{$k[0]}\t$k[0]";
                }
            }
        }
    }
    elsif($_=~/^\>/){
        print "$_\n";
    }
}
close IN;
Check out this Perl one-liner
$ cat iris.txt
>111
KKKKKKKMGFSOXLKPXLLLLLLLLLLLLLLLLLMJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJX
>222
WWWMPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPMPPPPPXKKKKKK
$ perl -ne ' if(!/^>/) { print "$p"; while(/(M[^M]+?X)/g ) { if(length($1)>length($x)) {$x=$1 } } print "$x ". length($x)."\n";$x="" } else { $p=$_ } ' iris.txt
>111
MJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJX 32
>222
MPPPPPX 7
$
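Note that the >222 line differs from the output the question asks for. The one-liner's pattern M[^M]+?X cannot run across a second M, so it settles for the shorter MPPPPPX, whereas a pattern such as M[^X]*X runs from an M to the first X that follows it. The sketch below compares the two on a sequence of the same shape. The sub name longest_match is hypothetical, and the sequence is rebuilt with the x operator on the assumption that the >222 record has 30 P between its two M, which is what the lengths 38 and 7 reported here imply.

```perl
use strict;
use warnings;

# Hypothetical helper: longest substring of $seq matched by $pattern.
sub longest_match {
    my ( $seq, $pattern ) = @_;
    my $longest = '';
    while ( $seq =~ /($pattern)/g ) {
        $longest = $1 if length($1) > length($longest);
    }
    return $longest;
}

# Same shape as the >222 record: M, 30 P, M, 5 P, X.
my $seq = 'WWW' . 'M' . 'P' x 30 . 'M' . 'PPPPP' . 'X' . 'KKKKKK';

for my $pattern ( qr/M[^M]+?X/, qr/M[^X]*X/ ) {
    my $hit = longest_match( $seq, $pattern );
    print "$pattern\t$hit\t", length($hit), "\n";
}
```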
There's more than one way to do it!
Try this too:
print and next if /^>/;
chomp and my @z = $_ =~ /(M[^X]*X)/g;
my $m = "";
for my $s (@z) {
    $m = $s if length $s > length $m
}
say "$m\t" . length $m
Output:
>111
MJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJX 32
>222
MPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPMPPPPPX 38
This uses Perl >= 5.14; make sure to run the script with perl -n.
As a one-liner:
perl -E 'print and next if /^>/; chomp and my @z = $_ =~ /(M[^X]*X)/g; my $m = ""; for my $s (@z) { $m = $s if length $s > length $m } say "$m\t" . length $m' -n data.txt
Here is solution using reduce from List::Util.
Edit: mistakenly used maxstr which gave results but is not what was needed. Have reedited this post to use reduce (correctly) instead.
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw/reduce/;
open my $fh, '<', \<<EOF;
>111
KKKKKKKMGFSOXLKPXLLLLLLLLLLLLLLLLLMJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJX
>222
WWWMPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPMPPPPPXKKKKKK
EOF
my $id;
while (<$fh>) {
chomp;
if (/^>/) {
$id = $_;
}
else {
my $data = reduce {length($a) > length($b) ? $a : $b} /M[^X]*X/g;
print "$id\n$data\t" . length($data) . "\n" if $data;
}
}
Here's my take on it.
I like fasta files tucked into a hash, with the fasta name as the key. This way you can just add descriptions to it, e.g. base composition etc...
#!/usr/local/ActivePerl-5.20/bin/env perl
use strict;
use warnings;
my %prot;
open (my $fh, '<', '/Users/me/Desktop/fun_prot.fa') or die $!;
my $string = do { local $/; <$fh> };
close $fh;
chomp $string;
my @fasta = grep {/./} split (">", $string);
for my $aa (@fasta){
    my ($key, $value) = split ("\n", $aa);
    # Keep only the M...M stretch; the trailing [A-Z]* drops everything after it
    $value =~ s/[A-Z]*(M.*M)[A-Z]*/$1/;
    $prot{$key}->{'len'} = length($value);
    $prot{$key}->{'prot'} = $value;
}
for my $sequence (sort { $prot{$b}->{'len'} <=> $prot{$a}->{'len'} } keys %prot){
print ">" . $sequence, "\n", $prot{$sequence}->{'prot'}, "\t", $prot{$sequence}->{'len'}, "\n";
last;
}
__DATA__
>1232
ASDFASMJJJJJMFASDFSDAFSDDFSA
>2343
AASFDFASMJJJJJJJJJJJJJJMRGQEGDAGDA
Output
>2343
MJJJJJJJJJJJJJJM 16

Display text file 5 lines at a time in Perl

I'm not sure how to do this.
I have a file (which will never be large, so won't need a module) and want to break it down so that I can display it on my web page 5 lines per row.
This is as far as I have got.
$row="5";
@DD=<DATA>;
foreach $line (@DD) {
    $count++;
    chomp($line);
    if ($count <= $row) {
        print qq~$line ~; # This shows 5, but don't know what to do next.
    }
}
exit;
__DATA__
aaaa
bbbb
cccc
dddd
eeee
ffff
gggg
hhhh
iiii
jjjj
kkkk
llll
mmmm
Expected result (should be in 3 lines but your forum software won't let me)
aaaa bbbb cccc dddd eeee
ffff gggg hhhh iiii jjjj
kkkk llll mmmm
Could someone help please?
You'd have to reset the count and print a new line at 5.
print qq~$line~;
if ( $count == $row ) {
print "\n";
$count = 0;
}
else {
print ' ';
}
However, easier still is a modulus:
use strict;
use warnings;
my $row = 5;
my $count = 0;
foreach my $line ( <DATA> ) {
chomp( $line );
print $line, ++$count % $row ? ' ' : "\n";
}
If $count is a multiple of $row print a newline, else print a space.
When you reach the limit (5) reset the counter to 0 and print a newline
$count = 0;
print "\n";
BTW there are a number of improvements you could make to your code, but the most important would be to use strict and warnings.
I think this will work:
use strict;
use warnings;
my $rows = 5;
my $count = 0;
my @lines = <DATA>;
chomp @lines;
foreach my $line (@lines) {
    $count++;
    print $line;
    if ($count == $rows) {
        # Row is full: end it and start counting again.
        $count = 0;
        print "\n";
    } else {
        print ' ';
    }
}
There are many problems with your code. See my comments below.
#!/usr/bin/env perl
use strict;
use warnings;
my $threshold = 5;
my @buffer;
while (my $line = <DATA>) {
    $line =~ s/\s\z//;
    push @buffer, $line;
    if (@buffer % $threshold == 0) {
        print join(' ', @buffer), "\n";
        @buffer = ();
    }
}
@buffer
    and print join(' ', @buffer), "\n";
__DATA__
aaaa
bbbb
cccc
dddd
eeee
ffff
gggg
hhhh
iiii
jjjj
kkkk
llll
mmmm
Here is a list of things you should think about:
First, you should use strict and warnings.
$row="5";
$row is intended to be used as numeric variable. Why assign a string to it?
@DD=<DATA>;
foreach $line (@DD) {
No need to create an extra array by slurping, of all things, your __DATA__ section. Instead, use while and read line-by-line.
$count++;
Perl's builtin $. counts the number of lines read. No need for an additional variable.
For variety: If you insist on slurping, you can slurp into a string:
#!/usr/bin/env perl
use strict;
use warnings;
my $threshold = 5;
my $contents = do { local $/; <DATA> };
while ($contents) {
    ($contents, my @fields) = reverse split(qr{\n}, $contents, $threshold + 1);
    print join(' ', reverse @fields), "\n";
}
or, continue to slurp into an array and use splice:
#!/usr/bin/env perl
use strict;
use warnings;
my $threshold = 5;
my @contents = <DATA>;
while (@contents) {
    print join(' ', map { chomp; $_ } splice @contents, 0, $threshold), "\n";
}
# always start your Perl 5 files with these
# two pragmas until you know exactly why they
# are recommended
use strict;
use warnings;
my $row = 5;
while ( <> ){
chomp;
print;
print $. % $row ? ' ' : "\n";
}
# makes sure there is always a trailing newline
print "\n" if $. % $row;
$ time ./example.pl /usr/share/dict/words
...
real 0m2.217s
user 0m0.097s
sys 0m0.084s
In Perl 6 I would probably write it as:
'filename'.IO.lines.rotor(5, :partial).map: *.say;
( currently takes about 15 seconds to process /usr/share/dict/words under the Moar backend, but it hasn't had 20 years of optimizations applied to it like Perl 5 has. It may be faster with the JVM backend )

Take random substrings from genome data

I am trying to use the substring function to take random 21 base sequences from a genome in fasta format. Below is the start of the sequence:
FILE1 data:
>gi|385195117|emb|HE681097.1| Staphylococcus aureus subsp. aureus HO 5096 0412 complete genome
CGATTAAAGATAGAAATACACGATGCGAGCAATCAAATTTCATAACATCACCATGAGTTTGGTCCGAAGCATGAGTGTTTACAATGTTTGAATACCTTATACAGTTCTTATACATAC
I have tried adapting a previous answer to use while reading my file and I'm not getting any error messages, just no output! The code is meant to prevent any overlap between sequences, though the chances of that are very small anyway.
Code as follows:
#!/usr/bin/perl
use strict;
use warnings;
use autodie;
my $outputfile = "/Users/edwardtickle/Documents/randomoutput.txt";
open FILE1, "/Users/edwardtickle/Documents/EMRSA-15.fasta";
open( OUTPUTFILE, ">$outputfile" );
while ( my $line = <FILE1> ) {
    if ( $line =~ /^([ATGCN]+)/ ) {
        my $genome = $1;
        my $size = 21;
        my $count = 5;
        my $mark = 'X';
        if ( 2 * $size * $count - $size - $count >= length($genome) ) {
            my @substrings;
            while ( @substrings < $count ) {
                my $pos = int rand( length($genome) - $size + 1 );
                push @substrings, substr( $genome, $pos, $size, $mark x $size )
                    if substr( $genome, $pos, $size ) !~ /\Q$mark/;
                for my $random (@substrings) {
                    print OUTPUTFILE "random\n";
                }
            }
        }
    }
}
Thanks for your help!
One of the neatest ways to select a random start point is to shuffle a list of all possible start points and select the first few -- as many as you need.
It's also best practice to use the three-parameter form of open, and lexical file handles.
The loop in this example starts much like your own, picking up the genomes using a regex. The subsequences of length $size can start anywhere from zero up to $len_genome - $size, so the program generates a list of all these starting points, shuffles them using the shuffle function from List::Util, and puts them in @start_points.
Finally, if there are sufficient start points to form $count different subsequences, then they are printed, using substr in the print statement.
use strict;
use warnings;
use autodie;
use List::Util qw/ shuffle /;
my $outputfile = '/Users/edwardtickle/Documents/randomoutput.txt';
open my $in_fh, '<', '/Users/edwardtickle/Documents/EMRSA-15.fasta';
open my $out_fh, '>', $outputfile;
my $size = 21;
my $count = 5;
while (my $line = <$in_fh>) {
    next unless $line =~ /^([ATGCN]+)/;
    my $genome = $1;
    my $len_genome = length $genome;
    my @start_points = shuffle(0 .. $len_genome-$size);
    next unless @start_points >= $count;
    print substr($genome, $_, $size), "\n" for @start_points[0 .. $count-1];
}
output
TACACGATGCGAGCAATCAAA
GTTTACAATGTTTGAATACCT
ACATCACCATGAGTTTGGTCC
ATAACATCACCATGAGTTTGG
GGTCCGAAGCATGAGTGTTTA
I would recommend saving all possible positions for a substring in an array. That way you can remove possibilities after each substring to prevent overlap:
#!/usr/bin/perl
use strict;
use warnings;
use autodie;
my $infile = "/Users/edwardtickle/Documents/EMRSA-15.fasta";
my $outfile = "/Users/edwardtickle/Documents/randomoutput.txt";
my $size = 21;
my $count = 5;
my $min_length = ( $count - 1 ) * ( 2 * $size - 1 ) + $size;
#open my $infh, '<', $infile;
#open my $outfh, '>', $outfile;
my $infh = \*DATA;
my $outfh = \*STDOUT;
while ( my $line = <$infh> ) {
    next unless $line =~ /^([ATGCN]+)/;
    my $genome = $1;
    # Need a long enough sequence for multiple substrings with no overlap
    if ( $min_length > length $genome ) {
        warn "Line $., Genome too small: Must be $min_length, not ", length($genome), "\n";
        next;
    }
    # Save all possible positions for substrings in an array. This enables us
    # to remove possibilities after each substring to prevent overlap.
    my @pos = ( 0 .. length($genome) - 1 - ( $size - 1 ) );
    for ( 1 .. $count ) {
        my $index = int rand @pos;
        my $pos = $pos[$index];
        # Remove from possible positions
        my $min = $index - ( $size - 1 );
        $min = 0 if $min < 0;
        splice @pos, $min, $size + $index - $min;
        my $substring = substr $genome, $pos, $size;
        print $outfh "$pos - $substring\n";
    }
}
__DATA__
>gi|385195117|emb|HE681097.1| Staphylococcus aureus subsp. aureus HO 5096 0412 complete genome
CGATTAAAGATAGAAATACACGATGCGAGCAATCAAATTTCATAACATCACCATGAGTTTGGTCCGAAGCATGAGTGTTTACAATGTTTGAATACCTTATACAGTTCTTATACATACCGATTAAAGATAGAAATACACGATGCGAGCAATCAAA
CGATTAAAGATAGAAATACACGATGCGAGCAATCAAATTTCATAACATCACCATGAGTTTGGTCCGAAGCATGAGTGTTTACAATGTTTGAATACCTTATACAGTTCTTATACATACCGATTAAAGATAGAAATACACGATGCGAGCAATCAAATTTCATAACATCACCATGAGTTTGGTCCGAAGCATGAGTGTTTACAATGTTTGAATACCTTATACAGTTCTTATACATAC
Outputs:
Line 2, Genome too small: Must be 185, not 154
101 - CAGTTCTTATACATACCGATT
70 - ATGAGTGTTTACAATGTTTGA
6 - AAGATAGAAATACACGATGCG
38 - TTCATAACATCACCATGAGTT
182 - GAAGCATGAGTGTTTACAATG
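The minimum length in the warning above can be checked by hand. A chosen window of $size characters rules out every start position within $size - 1 of its own start, which is at most 2 * $size - 1 positions. After $count - 1 picks, at least one start position must remain, and a genome of length L has L - $size + 1 start positions, so L must be at least ($count - 1) * (2 * $size - 1) + $size. The sketch below just evaluates that bound; the sub name min_genome_length is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: smallest genome length for which $count
# non-overlapping windows of $size characters can always be found.
sub min_genome_length {
    my ( $size, $count ) = @_;
    my $blocked_per_pick = 2 * $size - 1;    # start positions ruled out by one pick
    return ( $count - 1 ) * $blocked_per_pick + $size;
}

print min_genome_length( 21, 5 ), "\n";    # 185, as in the warning above
```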
Alternative method for large genomes
You mentioned in a comment that genome may be 2 gigs in size. If that's the case then it's possible that there won't be enough memory to have an array of all possible positions.
Your original approach of substituting a fake character for each chosen substring would work in that case. The following is how I would do it, using redo:
for ( 1 .. $count ) {
my $pos = int rand( length($genome) - ( $size - 1 ) );
my $str = substr $genome, $pos, $size;
redo if $str =~ /X/;
substr $genome, $pos, $size, 'X' x $size;
print $outfh "$pos - $str\n";
}
Also note that if your genome really is that big, you must also be wary of the randbits setting of your Perl version:
$ perl -V:randbits
randbits='48';
For some Windows versions, the randbits setting was just 15, so rand could return only 32,768 distinct values; see the question "Why does rand($val) not warn when $val > 2 ** randbits?"
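A script can make the same check itself before trusting rand on a long sequence. This is a minimal sketch, assuming the 2 GB figure mentioned above; the sub name rand_covers is hypothetical, while %Config and its randbits entry come from the core Config module.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: true if rand() has enough bits to reach every
# start position in a sequence of $length characters.
sub rand_covers {
    my ( $length, $randbits ) = @_;
    return $length <= 2**$randbits;
}

my $randbits = $Config{randbits};
my $length   = 2 * 1024**3;    # assumed genome size of about 2 GB
if ( defined $randbits ) {
    printf "randbits=%d: %s\n", $randbits,
        rand_covers( $length, $randbits ) ? 'enough' : 'too few bits';
}
```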
I found it more effective to move the output for loop outside the inner while, and to add a condition to the while such that $genome must contain a $size-long chunk that hasn't already been partly selected.
Just because you've got a string that's 117 characters long doesn't mean you'll find 5 random non-overlapping chunks.
#!/usr/bin/perl
use strict;
use warnings;
use autodie;
my $outputfile = "/Users/edwardtickle/Documents/randomoutput.txt";
open FILE1, "/Users/edwardtickle/Documents/EMRSA-15.fasta";
open( OUTPUTFILE, ">$outputfile" );
while ( my $line = <FILE1> ) {
    if ( $line =~ /^([ATGCN]+)/ ) {
        my $genome = $1;
        my $size = 21;
        my $count = 5;
        my $mark = 'X';
        if ( 2 * $size * $count - $size - $count >= length($genome) ) {
            my @substrings;
            while ( @substrings < $count
                and $genome =~ /[ATGCN]{$size}/ ) { # <- changed this
                my $pos = int rand( length($genome) - $size + 1 );
                push @substrings, substr( $genome, $pos, $size, $mark x $size )
                    if substr( $genome, $pos, $size ) !~ /\Q$mark/;
            }
            # v- changed this
            print OUTPUTFILE "$_\n" for @substrings;
        }
    }
}

Vertical index Perl

File 1 has ranges 3-9, 2-6 etc
3 9
2 6
12 20
File2 has values: column 1 indicates the range and column 2 has values.
1 4
2 4
3 5
4 4
5 4
6 1
7 1
8 1
9 4
I would like to calculate the sum of the values (file2, column 2) for the ranges in file1. E.g., if the range is 3-9, the sum of values will be 5+4+4+1+1+1+4 = 20.
What I have tried is:
open (FILE1,"file1.txt");
open (FILE2,"file2.txt");
@file1 = <FILE1>;
@file2 = <FILE2>;
foreach (@file1)
{
    @split_file2 = split("\\s",$_); # splitting the file by space
    foreach (@file2)
    {
        @split_file2 = split("\\s",$_); # splitting the file by space
        if (@split_file1[0] == @split_file2[0]) # if column0 of file1 matches with column0 of file2
        {
            $x += @split_file2[1]; # sum the column1 of file2
            if ( @split_file2[0] == @split_file1[0] ) # until column1 of file1 = column0 of file2.
            {
                last;
            }
        }
    }
}
Always use use strict; use warnings;.
split /\s/ is easier to read. split ' ' is what you actually want.
Don't use global variables (e.g. for file handles).
It's useful to check if open succeeds, if only by adding or die $!.
Use meaningful names, not file1 and file2.
use strict;
use warnings;
use feature qw( say );
use List::Util qw( sum );
my $file1 = 'file1.txt';
my $file2 = 'file2.txt';
my @file2;
{
    open(my $fh, '<', $file2)
        or die "Can't open $file2: $!\n";
    while (<$fh>) {
        my ($k, $v) = split;
        $file2[$k] = $v;
    }
}
{
    open(my $fh, '<', $file1)
        or die "Can't open $file1: $!\n";
    while (<$fh>) {
        my ($start, $end) = split;
        say sum grep defined, @file2[$start .. $end];
    }
}
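The difference between split /\s/ and split ' ' mentioned in the list above shows up as soon as a line has leading or repeated whitespace. The sketch below uses a made-up input line and two hypothetical wrapper subs to show it: /\s/ splits on every single whitespace character and so yields empty fields, while ' ' skips leading whitespace and treats runs of whitespace as one separator.

```perl
use strict;
use warnings;

# Hypothetical wrappers around the two forms of split being compared.
sub fields_by_regex { my ($line) = @_; return split /\s/, $line; }
sub fields_by_space { my ($line) = @_; return split ' ',  $line; }

my $line = " 3  9\n";    # made-up input: leading space and a double space

my @by_regex = fields_by_regex($line);    # ('', '3', '', '9')
my @by_space = fields_by_space($line);    # ('3', '9')

print scalar(@by_regex), " fields with /\\s/\n";
print scalar(@by_space), " fields with ' '\n";
```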
Another solution:
#!/usr/bin/perl
use strict;
use warnings;
my $f1 = shift;
my $f2 = shift;
open FH1, "<", $f1 or die "$!\n";
open FH2, "<", $f2 or die "$!\n";
my %data;
while (<FH1>) {
$data{$1} = $2 if ($_ =~ m/^(\d+)\s+(\d+)$/);
}
while (<FH2>) {
if ($_ =~ m/^(\d+)\s+(\d+)$/) {
my $sum;
for ($1..$2) {
$sum += $data{$_} if defined($data{$_});
}
print "sum for $1-$2: $sum\n" if defined($sum);
}
}
close FH1;
close FH2;
Call: script.pl values.txt ranges.txt

Is there a Perl module for parsing numbers, including ranges?

Is there a module, which does this for me?
sample_input: 2, 5-7, 9, 3, 11-14
#!/usr/bin/env perl
use warnings; use strict; use 5.012;
sub aw_parse {
    my( $in, $max ) = @_;
    chomp $in;
    my @array = split ( /\s*,\s*/, $in );
    my %zahlen;
    for ( @array ) {
        if ( /^\s*(\d+)\s*$/ ) {
            $zahlen{$1}++;
        }
        elsif ( /^\s*(\d+)\s*-\s*(\d+)\s*$/ ) {
            die "'$1-$2' not a valid input $!" if $1 >= $2;
            for ( $1 .. $2 ) {
                $zahlen{$_}++;
            }
        } else {
            die "'$_' not a valid input $!";
        }
    }
    @array = sort { $a <=> $b } keys ( %zahlen );
    if ( defined $max ) {
        for ( @array ) {
            die "Input '0' not allowed $!" if $_ == 0;
            die "Input ($_) greater than $max not allowed $!" if $_ > $max;
        }
    }
    return \@array;
}
my $max = 20;
print "Input (max $max): ";
my $in = <>;
my $out = aw_parse( $in, $max );
say "@$out";
A CPAN search for number range gives me this, which looks pretty much like what you're looking for:
Number::Range
Here's an example of how you can use the module in your aw_parse function:
$in =~ s/\s+//g; # remove spaces
$in =~ s/(?<=\d)-/../g; # replace - with ..
my $range = Number::Range->new($in); # create the range
my @array = sort { $a <=> $b } $range->range; # get an array of numbers
Applied to the sample from the question:
Input (max 20): 2, 5-7, 9, 3, 11-14
2 3 5 6 7 9 11 12 13 14
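To see what the two substitutions hand to Number::Range, they can be run on their own with no module installed. This is only an illustration of the string rewriting; the sub name to_range_string is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "2, 5-7, 9" into "2,5..7,9".
sub to_range_string {
    my ($in) = @_;
    $in =~ s/\s+//g;           # remove spaces
    $in =~ s/(?<=\d)-/../g;    # replace - with ..
    return $in;
}

print to_range_string('2, 5-7, 9, 3, 11-14'), "\n";    # 2,5..7,9,3,11..14
```

The lookbehind (?<=\d) means only a hyphen that follows a digit is rewritten, so a leading minus sign would be left alone.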