Example of 'a subroutine may have several entry and exit points'

Example of 'a subroutine may have several entry and exit points' - subroutine

I'm reading the paper of Non structured programming, and found it says:
Unlike a procedure, a subroutine may have several entry and exit points, and a direct jump into or out of subroutine is (theoretically) allowed
I can't understand it, could anyone give me an code sample of:
a subroutine may have several entry and exit points
a direct jump into or out of subroutine
Thanks

10 A = 1
20 GOSUB 100
30 A = 2
40 GOSUB 110
50 A = 3
60 GOTO 130
70 END
100 PRINT A
110 PRINT "HELLO"
120 IF A = 1 THEN RETURN
130 PRINT "THERE"
140 IF A = 3 THEN GOTO 70
150 RETURN
The subroutine has three entry points (lines 100, 110, and 130) and three exit points (lines 120, 140, and 150). There is a direct jump into line 130 (from line 60) and a direct jump out (at line 140).

Related

How does mro, goto, and set_subname interact?

This is a complex question with regard to mro.pm, and the interplay with set_subname, and goto
In troubleshooting a problem, I think the core of my misunderstanding relates to how mro.pm works -- especially with regard to set_subname.
What is the difference between these three constructs,
Plain call to set_subname
*Foo::bar = set_subname( 'Foo::bar', $codeRef );
Anon sub which wraps a set_subname
*Foo::bar = sub {
my $codeRef2 = set_subname('Foo::bar', $codeRef);
goto $codeRef2
};
Anon sub which has its name set with set_subname
*Foo::bar = set_subname(
'Foo::bar',
sub { goto $codeRef }
);
Specifically, the Mojo test suits fails with either of these modifications with anon subs applied to Mojo::Utils's monkey_patch Running the two variants above against t/mojo/websocket_proxy.t,
With the 2 (second) option I have
*{"${class}::$k"} = sub {
my $cr = set_subname("${class}::$k", $patch{$k});
goto $cr;
};
And I get
Mojo::Reactor::Poll: Timer failed: Can't locate object method "send" via package "Mojo::Transaction::HTTP" at t/mojo/websocket_proxy.t line 66.
With the 3 (third) option I have,
*{"${class}::$k"} = set_subname("${class}::$k", sub { goto $patch{$k} })
And I get
No next::method 'new' found for Mojolicious::Routes at /usr/lib/x86_64-linux-gnu/perl/5.28/mro.pm line 30.
Obviously, the first version works (it's from the code I linked), the question is why are the other two variants giving me different errors (especially the second variant) and what's happening there -- why don't they work?

Your second option is not working because the sub you are using as a wrapper does not match the prototype of the inner sub. monkey_patch is not only used for methods, and this changes how some functions are parsed. In particular, Mojo::Util::steady_time has an empty prototype and is often called without using parenthesis.
*{"${class}::$k"} = Sub::Util::set_prototype(
Sub::Util::prototype( $patch{$k} ),
Sub::Util::set_subname(
"${class}::$k",
sub {
my $cr = Sub::Util::set_subname("${class}::$k", $patch{$k});
goto $cr;
}
)
);
The third construct is not working because you are using goto to remove the renamed wrapper sub from the call stack, leaving only the inner sub which has no name. This breaks next::method's ability to find the correct method name.

That is indeed complicated, but maybe you overcomplicated it?
Remember that MRO is only concerned with locating a method, which is just a symbol table entry to a coderef, through a defined order of package names. The internal subname only has to do with what caller() reports AFAIK.
From: Mojo
*{"${class}::$_"} = ## symbol table entry
set_subname("${class}::$_", ## an internal name
$patch{$_}) ## for a code ref
for keys %patch;
HTH
Edit After Seeing Error Messages:
The subroutines have not been validly installed. I suspect that since in option 2 and 3 you are deferring the calls to set_subname() to call time, the coderef $patch{$k} never has a subname assigned to it and that breaks a link in the chain of mro::_nextcan()'s XS magic. Particularly if $patch{$k} calls next::method. The closures seem to be valid though.
Although I must say my testing seems to show that option 2 is valid.
Enter command: my ($class, $k) = qw/W S/; my %patch = (S =>
sub {print "patch here\n"; decall;}); *{"${class}::$k"} =
sub { print "goto'r here\n"; my $cr = set_subname("${class}::$k",
$patch{$k}); goto $cr;};
Enter command: decall
0 "console"
1 "console.pl"
2 "114"
3 "(eval)"
4 "0"
5 0
6 "package W; decall"
7 ""
8 "256"
9 "\020\001\000\000\000P\004\000\000\000\000\000\000U\025U\005"
10 0
Enter command: S
goto'r here
patch here
0 "W"
1 "(eval 110)"
2 "1"
3 "W::S"
4 "1"
5 0
6 0
7 0
8 "256"
9 "\020\001\000\000\000P\004\000\000\000\000\000\000U\025U\005"
10 0
You might have to start looking farther afield for the problem with option 2.

After modifying Mojo/Util.pm with
foreach my $k (keys %patch) {
*{"${class}::$k"} = sub {
my $cr = set_subname("${class}::$k", $patch{$k});
goto $cr;
};
}
and isolating the test case, I get:
$ perl -MCarp::Always t/mojo/websocket_proxy2.t
Mojo::Reactor::Poll: Timer failed: Can't locate object method "send" via package "Mojo::Transaction::HTTP" at t/mojo/websocket_proxy2.t line 59.
main::__ANON__(Mojo::UserAgent=HASH(0x60280dad0), Mojo::Transaction::HTTP=HASH(0x6029ee7b0)) called at blib/lib/Mojo/UserAgent.pm line 252
Mojo::UserAgent::_finish(Mojo::UserAgent=HASH(0x60280dad0), "927210c53042c6142eda3f4010c8b17c", 1) called at blib/lib/Mojo/UserAgent.pm line 220
Mojo::UserAgent::_error(Mojo::UserAgent=HASH(0x60280dad0), "927210c53042c6142eda3f4010c8b17c", "Connect timeout") called at blib/lib/Mojo/UserAgent.pm line 128
Mojo::UserAgent::__ANON__(Mojo::IOLoop=HASH(0x601f9abb8), "Connect timeout", undef) called at blib/lib/Mojo/IOLoop.pm line 63
Mojo::IOLoop::__ANON__(Mojo::IOLoop::Client=HASH(0x601e34598)) called at blib/lib/Mojo/EventEmitter.pm line 15
Mojo::EventEmitter::emit(Mojo::IOLoop::Client=HASH(0x601e34598), "error", "Connect timeout") called at blib/lib/Mojo/IOLoop/Client.pm line 39
Mojo::IOLoop::Client::__ANON__(Mojo::Reactor::Poll=HASH(0x6001a8390)) called at blib/lib/Mojo/Reactor/Poll.pm line 143
eval {...} called at blib/lib/Mojo/Reactor/Poll.pm line 143
Mojo::Reactor::Poll::_try(Mojo::Reactor::Poll=HASH(0x6001a8390), "Timer", CODE(0x601e24ca0)) called at blib/lib/Mojo/Reactor/Poll.pm line 81
Mojo::Reactor::Poll::one_tick(Mojo::Reactor::Poll=HASH(0x6001a8390)) called at blib/lib/Mojo/Reactor/Poll.pm line 99
Mojo::Reactor::Poll::start(Mojo::Reactor::Poll=HASH(0x6001a8390)) called at blib/lib/Mojo/IOLoop.pm line 134
Mojo::IOLoop::start("Mojo::IOLoop") called at t/mojo/websocket_proxy2.t line 62
at blib/lib/Mojo/IOLoop.pm line 23.
Mojo::IOLoop::__ANON__(Mojo::Reactor::Poll=HASH(0x6001a8390), "Timer failed: Can't locate object method \"send\" via package \"Mojo::Transaction::HTTP\" at t/mojo/websocket_proxy2.t line 59.\x{a}\x{9}main::__ANON__(Mojo::UserAgent=HASH(0x60280dad0), Mojo::Transaction::HTTP=HASH(0x6029ee7b0)) called at blib/lib/Mojo/UserAgent.pm line 252\x{a}\x{9}Mojo::UserAgent::_finish(Mojo::UserAgent=HASH(0x60280dad0), \"927210c53042c6142eda3f4010c8b17c\", 1) called at blib/lib/Mojo/UserAgent.pm line 220\x{a}\x{9}Mojo::UserAgent::_error(Mojo::UserAgent=HASH(0x60280dad0), \"927210c53042c6142eda3f4010c8b17c\", \"Connect timeout\") called at blib/lib/Mojo/UserAgent.pm line 128\x{a}\x{9}Mojo::UserAgent::__ANON__(Mojo::IOLoop=HASH(0x601f9abb8), \"Connect timeout\", undef) called at blib/lib/Mojo/IOLoop.pm line 63\x{a}\x{9}Mojo::IOLoop::__ANON__(Mojo::IOLoop::Client=HASH(0x601e34598)) called at blib/lib/Mojo/EventEmitter.pm line 15\x{a}\x{9}Mojo::EventEmitter::emit(Mojo::IOLoop::Client=HASH(0x601e34598), \"error\", \"Connect timeout\") called at blib/lib/Mojo/IOLoop/Client.pm line 39\x{a}\x{9}Mojo::IOLoop::Client::__ANON__(Mojo::Reactor::Poll=HASH(0x6001a8390)) called at blib/lib/Mojo/Reactor/Poll.pm line 143\x{a}\x{9}eval {...} called at blib/lib/Mojo/Reactor/Poll.pm line 143\x{a}\x{9}Mojo::Reactor::Poll::_try(Mojo::Reactor::Poll=HASH(0x6001a8390), \"Timer\", CODE(0x601e24ca0)) called at blib/lib/Mojo/Reactor/Poll.pm line 81\x{a}\x{9}Mojo::Reactor::Poll::one_tick(Mojo::Reactor::Poll=HASH(0x6001a8390)) called at blib/lib/Mojo/Reactor/Poll.pm line 99\x{a}\x{9}Mojo::Reactor::Poll::start(Mojo::Reactor::Poll=HASH(0x6001a8390)) called at blib/lib/Mojo/IOLoop.pm line 134\x{a}\x{9}Mojo::IOLoop::start(\"Mojo::IOLoop\") called at t/mojo/websocket_proxy2.t line 62\x{a}") called at blib/lib/Mojo/EventEmitter.pm line 15
Mojo::EventEmitter::emit(Mojo::Reactor::Poll=HASH(0x6001a8390), "error", "Timer failed: Can't locate object method \"send\" via package \"Mojo::Transaction::HTTP\" at t/mojo/websocket_proxy2.t line 59.\x{a}\x{9}main::__ANON__(Mojo::UserAgent=HASH(0x60280dad0), Mojo::Transaction::HTTP=HASH(0x6029ee7b0)) called at blib/lib/Mojo/UserAgent.pm line 252\x{a}\x{9}Mojo::UserAgent::_finish(Mojo::UserAgent=HASH(0x60280dad0), \"927210c53042c6142eda3f4010c8b17c\", 1) called at blib/lib/Mojo/UserAgent.pm line 220\x{a}\x{9}Mojo::UserAgent::_error(Mojo::UserAgent=HASH(0x60280dad0), \"927210c53042c6142eda3f4010c8b17c\", \"Connect timeout\") called at blib/lib/Mojo/UserAgent.pm line 128\x{a}\x{9}Mojo::UserAgent::__ANON__(Mojo::IOLoop=HASH(0x601f9abb8), \"Connect timeout\", undef) called at blib/lib/Mojo/IOLoop.pm line 63\x{a}\x{9}Mojo::IOLoop::__ANON__(Mojo::IOLoop::Client=HASH(0x601e34598)) called at blib/lib/Mojo/EventEmitter.pm line 15\x{a}\x{9}Mojo::EventEmitter::emit(Mojo::IOLoop::Client=HASH(0x601e34598), \"error\", \"Connect timeout\") called at blib/lib/Mojo/IOLoop/Client.pm line 39\x{a}\x{9}Mojo::IOLoop::Client::__ANON__(Mojo::Reactor::Poll=HASH(0x6001a8390)) called at blib/lib/Mojo/Reactor/Poll.pm line 143\x{a}\x{9}eval {...} called at blib/lib/Mojo/Reactor/Poll.pm line 143\x{a}\x{9}Mojo::Reactor::Poll::_try(Mojo::Reactor::Poll=HASH(0x6001a8390), \"Timer\", CODE(0x601e24ca0)) called at blib/lib/Mojo/Reactor/Poll.pm line 81\x{a}\x{9}Mojo::Reactor::Poll::one_tick(Mojo::Reactor::Poll=HASH(0x6001a8390)) called at blib/lib/Mojo/Reactor/Poll.pm line 99\x{a}\x{9}Mojo::Reactor::Poll::start(Mojo::Reactor::Poll=HASH(0x6001a8390)) called at blib/lib/Mojo/IOLoop.pm line 134\x{a}\x{9}Mojo::IOLoop::start(\"Mojo::IOLoop\") called at t/mojo/websocket_proxy2.t line 62\x{a}") called at blib/lib/Mojo/Reactor/Poll.pm line 143
Mojo::Reactor::Poll::_try(Mojo::Reactor::Poll=HASH(0x6001a8390), "Timer", CODE(0x601e24ca0)) called at blib/lib/Mojo/Reactor/Poll.pm line 81
Mojo::Reactor::Poll::one_tick(Mojo::Reactor::Poll=HASH(0x6001a8390)) called at blib/lib/Mojo/Reactor/Poll.pm line 99
Mojo::Reactor::Poll::start(Mojo::Reactor::Poll=HASH(0x6001a8390)) called at blib/lib/Mojo/IOLoop.pm line 134
Mojo::IOLoop::start("Mojo::IOLoop") called at t/mojo/websocket_proxy2.t line 62
I can also confirm that setting the prototype fixes it.

Displaying human-readable text in perl Log::Report stack traces

A library that I'm using XML::Compile::Translate::Reader calls Log::Report's error method
or error __x"data for element or block starting with `{tag}' missing at {path}"
, tag => $label, path => $path, _class => 'misfit';
As I've got Log::Report set to debug mode, it returns a stack trace for an error.
[11 07 2014 22:17:39] [2804] error: data for element or block starting with `MSISDN' missing at {http://www.sigvalue.com/acc}TA
at /usr/local/share/perl5/XML/Compile/Translate/Reader.pm line 476
Log::Report::error("Log::Report::Message=HASH(0x2871cf8)") at /usr/local/share/perl5/XML/Compile/Translate/Reader.pm line 476
<snip>
XML::Compile::SOAP::Daemon::LWPutil::lwp_run_request("HTTP::Request=HASH(0x2882858)", "CODE(0x231ba38)", "HTTP::Daemon::ClientConn::SSL=GLOB(0x231b9c0)", undef) at /usr/local/share/perl5/XML/Compile/SOAP/Daemon/LWPutil.pm line 95
Any::Daemon::run("XML::Compile::SOAP::Daemon::AnyDaemon=HASH(0x7a3168)", "child_task", "CODE(0x2548128)", "max_childs", 36, "background", 1) at /usr/local/share/perl5/XML/Compile/SOAP/Daemon/AnyDaemon.pm line 75
XML::Compile::SOAP::Daemon::AnyDaemon::_run("XML::Compile::SOAP::Daemon::AnyDaemon=HASH(0x7a3168)", "HASH(0x18dda00)") at /usr/local/share/perl5/XML/Compile/SOAP/Daemon.pm line 99
(eval)("XML::Compile::SOAP::Daemon::AnyDaemon=HASH(0x7a3168)", "HASH(0x18dda00)") at /usr/local/share/perl5/XML/Compile/SOAP/Daemon.pm line 94
XML::Compile::SOAP::Daemon::run("XML::Compile::SOAP::Daemon::AnyDaemon=HASH(0x7a3168)", "name", "rizserver.pl", "background", 1, "max_childs", 36, "socket", [7 more]) at ./rizserver.pl line 95
There is lots of juicy data in those HASH, SCALAR, GLOB, and other elements that I want to get logged; as we are having trouble logging the original request in case it doesn't match.
I've explored using
Some leads that I don't know how to use are using Log::Dispatch, or some sort of Filter on Log::Report; but in the end, all I really want is to apply Data::Dumper to those elements.

How can a circle touch a line in Qbasic and end the program?

I am trying to make a maze in Qbasic but when the pointer touches the maze lines then the program is not ending. I want that when the circle (which is the pointer ) touches the ends of the maze then the program should go to an end.The Program is this:-
cls
screen 12
DIM p AS STRING
DIM A1 AS STRING
15 print"What do you want to do?"
print"A:Draw AN IMAGE"," B:PLAY A MAZE GAME";
PRINT
PRINT"TYPE 'A' OR 'B'IN CAPITAL FORM"
GOTO 102
99 print "rules to play the maze game:"
print
print "1 use 'W' to move the ball foward"
print "2 use 'S' to move the ball backward"
print "3 use 'A' to move the ball leftward"
print "4 use 'D' to move the ball rightward"
INPUT A1
CLS
goto 10
102 INPUT P
if p="A"then
cls
goto 20
elseif p="B" then
cls
goto 99
elseif p<>"A" AND p<>"B" then
print "Choose between A and B"
GOTO 70
end if
10 pset(120,120)
draw "r100"
pset (120,140)
draw"r80"
pset (200,140)
draw "d100"
pset (220,120)
draw"d140"
pset (220,260)
draw "l90"
pset (200,240)
draw "l50"
pset (130,260)
draw"u50l120u90r60u40l50u60r300d90l35d260l60d30l80
h20l20h20u30r40u5l70d60f40r250u90h40u45r40u40r50u130h40r225d65l50d60l15
d130l40d50l20d15r45d40r20u45r10u10r10u90r100"
pset(150,240)
draw"u50l120u50r60u80l50u20r260d50l35d260l60d30l40h20l20h10r
40u50l120d98f50r290u115h40u20r40u40r50u160h10r140d20l50d60l15
d130h40d90l20d60r45d45r70u45r10u10r10u90r75"
20 dim k as string
x = 110
y = 105
do
k = ucase$(inkey$)
if k="W"then
y = y - 2
elseif k= "S" then
y = y + 8
elseif k="A"then
x = x - 8
elseif k="D" then
x = x + 5
end if
circle (x,y),7,10
loop until k ="Q"
GOTO 45
70 CLS
GOTO 15
if x=120 and y=120 then goto 45
40 cls
45 END
Pls Help
Thanks in Advance....

Ok, let's take a peak at your game loop presented below and reformated a bit for readability:
do
k = ucase$(inkey$)
if k="W"then
y = y - 2
elseif k= "S" then
y = y + 8
elseif k="A"then
x = x - 8
elseif k="D" then
x = x + 5
end if
circle (x,y),7,10
loop until k ="Q"
Your win case (if x=120 and y=120 then goto 45) doesn't actually occur within the loop but outside it.
With do loops, only the code between the do and loop statement will execute unless the "until" statement returns true. In order words:
do
'This code will execute
loop until k = "Q"
'This code will only execute after k=Q
Put the win case in the do loop and it should work.
If I recall correctly, QBasic allows whitespace in the beginning of a line. I recommend using whitespace to organize your code visually so you can see what's going on. Look at how I formatted your main loop. Everything that the do loop controls is tabbed to the right of the do and loop statement. This way you can easily see what the do loop is doing. Everything in the if statement gets the same treatment for similar reasons.
If you get in the habit of indenting your code, you can start to see the code's logic laid out cleanly.
Edit: It seems you're new to programming. If you enjoy it, I recommend learning Python through codecademy rather than QBasic. QBasic encourages some very bad habits, like goto statements.

Difference between 2 strings

I want to compare some strings like this
Previous -> Present
Something like
path 1 : 100 -> 112 --> 333 --> 500
path 2 : 100 -> 333 --> 500
path 3 : 100 -> 333 --> 500 --> 500
path 4 : 100 -> 112 --> 500
I need to compare path 1 with path 2, get the number that was in path 1 which doesn't exist in path 2 and store it in a database
Then compare path 2 with path 3 and do same thing. If it already exists then increment it. Otherwise insert the new number.
I know how to insert into a database and increment if the entry exists. What I don't know is how to loop through all those paths getting those values then deciding whether to insert into the database.
I have done some research, and I have heard of Levenshtein Edit Distance but I can't figure out how I should do it.

Your question appears to be:
Given two lists of numbers, how can I tell which ones in list A aren't in list B?
Hashes are useful for doing set arithmetic.
my #a = ( 100, 112, 333, 500 );
my #b = ( 100, 333, 500 );
my %b = map { $_ => 1 } #b;
my #missing = grep { !$b{$_} } #a;

How can I searching for different variants of bioinformatics motifs in string, using Perl?

I have a program output with one tandem repeat in different variants. Is it possible to search (in a string) for the motif and to tell the program to find all variants with maximum "3" mismatches/insertions/deletions?

I will take a crack at this with the very limited information supplied.
First, a short friendly editorial:
<editorial>
Please learn how to ask a good question and how to be precise.
At a minimum, please:
Refrain from domain specific jargon such as "motif" and "tandem repeat" and "base pairs" without providing links or precise definitions;
Say what the goal is and what you have done so far;
Important: Provide clear examples of input and desired output.
It is not helpful to potential helpers on SO have to have to play 20 questions in comments to try and understand your question! I spent more time trying to figure out what you were asking than answering it.
</editorial>
The following program generates a string of 2 character pairs 5,428 pairs long in an array of 1,000 elements long. I realize it is more likely that you will be reading these from a file, but this is just an example. Obviously you would replace the random strings with your actual data from whatever source.
I do not know if 'AT','CG','TC','CA','TG','GC','GG' that I used are legitimate base pair combinations or not. (I slept through biology...) Just edit the map block pairs to legitimate pairs and change the 7 to the number of pairs if you want to generate legitimate random strings for testing.
If the substring at the offset point is 3 differences or less, the array element (a scalar value) is stored in an anonymous array in the value part of a hash. The key part of the hash is the substring that is a near match. Rather than array elements, the values could be file names, Perl data references or other relevant references you want to associate with your motif.
While I have just looked at character by character differences between the strings, you can put any specific logic that you need to look at by replacing the line foreach my $j (0..$#a1) { $diffs++ unless ($a1[$j] eq $a2[$j]); } with the comparison logic that works for your problem. I do not know how mismatches/insertions/deletions are represented in your string, so I leave that as an exercise to the reader. Perhaps Algorithm::Diff or String::Diff from CPAN?
It is easy to modify this program to have keyboard input for $target and $offset or have the string searched beginning to end rather than several strings at a fixed offset. Once again: it was not really clear what your goal is...
use strict; use warnings;
my #bps;
push(#bps,join('',map { ('AT','CG','TC','CA','TG','GC','GG')[rand 7] }
0..5428)) for(1..1_000);
my $len=length($bps[0]);
my $s_count= scalar #bps;
print "$s_count random strings generated $len characters long\n" ;
my $target="CGTCGCACAG";
my $offset=832;
my $nlen=length $target;
my %HoA;
my $diffs=0;
my #a2=split(//, $target);
substr($bps[-1], $offset, $nlen)=$target; #guarantee 1 match
substr($bps[-2], $offset, $nlen)="CATGGCACGG"; #anja example
foreach my $i (0..$#bps) {
my $cand=substr($bps[$i], $offset, $nlen);
my #a1=split(//, $cand);
$diffs=0;
foreach my $j (0..$#a1) { $diffs++ unless ($a1[$j] eq $a2[$j]); }
next if $diffs > 3;
push (#{$HoA{$cand}}, $i);
}
foreach my $hit (keys %HoA) {
my #a1=split(//, $hit);
$diffs=0;
my $ds="";
foreach my $j (0..$#a1) {
if($a1[$j] eq $a2[$j]) {
$ds.=" ";
} else {
$diffs++;
$ds.=$a1[$j];
}
}
print "Target: $target\n",
"Candidate: $hit\n",
"Differences: $ds $diffs differences\n",
"Array element: ";
foreach (#{$HoA{$hit}}) {
print "$_ " ;
}
print "\n\n";
}
Output:
1000 random strings generated 10858 characters long
Target: CGTCGCACAG
Candidate: CGTCGCACAG
Differences: 0 differences
Array element: 999
Target: CGTCGCACAG
Candidate: CGTCGCCGCG
Differences: CGC 3 differences
Array element: 696
Target: CGTCGCACAG
Candidate: CGTCGCCGAT
Differences: CG T 3 differences
Array element: 851
Target: CGTCGCACAG
Candidate: CGTCGCATGG
Differences: TG 2 differences
Array element: 986
Target: CGTCGCACAG
Candidate: CATGGCACGG
Differences: A G G 3 differences
Array element: 998
..several cut out..
Target: CGTCGCACAG
Candidate: CGTCGCTCCA
Differences: T CA 3 differences
Array element: 568 926

I believe that there are routines for this sort of thing in BioPerl.
In any case, you might get better answers if you asked this over at BioStar, the bioinformatics stack exchange.

When I was in my first couple years of learning perl, I wrote what I now consider to be a very inefficient (but functional) tandem repeat finder (which used to be available on my old job's company website) called tandyman. I wrote a fuzzy version of it a couple years later called cottonTandy. If I were to re-write it today, I would use hashes for a global search (given the allowed mistakes) and utilize pattern matching for a local search.
Here's an example of how you use it:
#!/usr/bin/perl
use Tandyman;
$sequence = "ATGCATCGTAGCGTTCAGTCGGCATCTATCTGACGTACTCTTACTGCATGAGTCTAGCTGTACTACGTACGAGCTGAGCAGCGTACgTG";
my $tandy = Tandyman->new(\$sequence,'n'); #Can't believe I coded it to take a scalar reference! Prob. fresh out of a cpp class when I wrote it.
$tandy->SetParams(4,2,3,3,4);
#The parameters are, in order:
# repeat unit size
# min number of repeat units to require a hit
# allowed mistakes per unit (an upper bound for "mistake concentration")
# allowed mistakes per window (a lower bound for "mistake concentration")
# number of units in a "window"
while(#repeat_info = $tandy->FindRepeat())
{print(join("\t",#repeat_info),"\n")}
The output of this test looks like this (and takes a horrendous 11 seconds to run):
25 32 TCTA 2 0.87 TCTA TCTG
58 72 CGTA 4 0.81 CTGTA CTA CGTA CGA
82 89 CGTA 2 0.87 CGTA CGTG
45 51 TGCA 2 0.87 TGCA TGA
65 72 ACGA 2 0.87 ACGT ACGA
23 29 CTAT 2 0.87 CAT CTAT
36 45 TACT 3 0.83 TACT CT TACT
24 31 ATCT 2 1 ATCT ATCT
51 59 AGCT 2 0.87 AGTCT AGCT
33 39 ACGT 2 0.87 ACGT ACT
62 72 ACGT 3 0.83 ACT ACGT ACGA
80 88 ACGT 2 0.87 AGCGT ACGT
81 88 GCGT 2 0.87 GCGT ACGT
63 70 CTAC 2 0.87 CTAC GTAC
32 38 GTAC 2 0.87 GAC GTAC
60 74 GTAC 4 0.81 GTAC TAC GTAC GAGC
23 30 CATC 2 0.87 CATC TATC
71 82 GAGC 3 0.83 GAGC TGAGC AGC
1 7 ATGC 2 0.87 ATGC ATC
54 60 CTAG 2 0.87 CTAG CTG
15 22 TCAG 2 0.87 TCAG TCGG
70 81 CGAG 3 0.83 CGAG CTGAG CAG
44 50 CATG 2 0.87 CTG CATG
25 32 TCTG 2 0.87 TCTA TCTG
82 89 CGTG 2 0.87 CGTA CGTG
55 73 TACG 5 0.75 TAGCTG TAC TACG TACG AG
69 83 AGCG 4 0.81 ACG AGCTG AGC AGCG
15 22 TCGG 2 0.87 TCAG TCGG
As you can see, it allows indels and SNPs. The columns are, in order:
Start position
Stop position
Consensus sequence
The number of units found
A quality metric out of 1
The repeat units separated by spaces
Note, that it's easy to supply parameters (as you can see from the output above) that will output junk/insignificant "repeats", but if you know how to supply good params, it can find what you set it upon finding.
Unfortunately, the package is not publicly available. I never bothered to make it available since it's so slow and not amenable to even prokaryotic-sized genome searches (though it would be workable for individual genes). In my novice coding days, I had started to add a feature to take a "state" as input so that I could run it on sections of a sequence in parallel and I never finished that once I learned hashes would make it so much faster. By that point, I had moved on to other projects. But if it would suit your needs, message me, I can email you a copy.
It's just shy of 1000 lines of code, but it has lots of bells & whistles, such as the allowance of IUPAC ambiguity codes (BDHVRYKMSWN). It works for both amino acids and nucleic acids. It filters out internal repeats (e.g. does not report TTTT or ATAT as 4nt consensuses).

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

Example of 'a subroutine may have several entry and exit points' - subroutine

Related

How does mro, goto, and set_subname interact?

Displaying human-readable text in perl Log::Report stack traces

How can a circle touch a line in Qbasic and end the program?

Difference between 2 strings

How can I searching for different variants of bioinformatics motifs in string, using Perl?

Categories

Resources