Reverse Engineering a Perl script based on a core dump - perl

A friend's server (yes, really. Not mine.) was broken into and we discovered a perl binary running some bot code. We could not find the script itself (probably eval'ed as received over the network), but we managed to create a core dump of the perl process.
Running strings on the core gave us some hints (hostnames, usernames / passwords), but not the source code of the script.
We'd like to know what the script was capable of doing, so we'd like to reverse-engineer the perl code that was running inside that perl interpreter.
Searching around, the closest thing to a perl de-compiler I found is the B::Deparse module which seems to be perfectly suitable for converting the bytecode of the parse-trees back into readable code.
Now, how do I get B::Deparse to operate on a core dump? Or, alternatively, how could I restart the program from the core, load B::Deparse and execute it?
Any ideas are welcome.

ysth asked me on IRC to comment on your question. I've done a whole
pile of stuff "disassembling" compiled perl and stuff (just see my
CPAN page [http://search.cpan.org/~jjore]).
Perl compiles your source to a tree of OP* structs which
occasionally have C pointers to SV* which are perl values. Your core
dump now has a bunch of those OP* and SV* stashed.
The best possible world would be to have a perl module like
B::Deparse do the information-understanding work for you. It
works by using a light interface to perl memory in the B::OP and
B::SV classes (documented in B, perlguts, and
perlhack). This is unrealistic for you because a B::* object is
just a pointer into memory with accessors to decode the struct for our
use. Consider:
require Data::Dumper;
require Scalar::Util;
require B;
my $value = 'this is a string';
my $sv = B::svref_2object( \ $value );
my $address = Scalar::Util::refaddr( \ $value );
local $Data::Dumper::Sortkeys = 1;
local $Data::Dumper::Purity = 1;
print Data::Dumper::Dumper(
{
address => $address,
value => \ $value,
sv => $sv,
sv_attr => {
CUR => $sv->CUR,
LEN => $sv->LEN,
PV => $sv->PV,
PVBM => $sv->PVBM,
PVX => $sv->PVX,
as_string => $sv->as_string,
FLAGS => $sv->FLAGS,
MAGICAL => $sv->MAGICAL,
POK => $sv->POK,
REFCNT => $sv->REFCNT,
ROK => $sv->ROK,
SvTYPE => $sv->SvTYPE,
object_2svref => $sv->object_2svref,
},
}
);
which when run showed that the B::PV object (it is ISA B::SV) is
truely merely an interface to the memory representation of the
compiled string this is a string.
$VAR1 = {
'address' => 438506984,
'sv' => bless( do{\(my $o = 438506984)}, 'B::PV' ),
'sv_attr' => {
'CUR' => 16,
'FLAGS' => 279557,
'LEN' => 24,
'MAGICAL' => 0,
'POK' => 1024,
'PV' => 'this is a string',
'PVBM' => 'this is a string',
'PVX' => 'this is a string',
'REFCNT' => 2,
'ROK' => 0,
'SvTYPE' => 5,
'as_string' => 'this is a string',
'object_2svref' => \'this is a string'
},
'value' => do{my $o}
};
$VAR1->{'value'} = $VAR1->{'sv_attr'}{'object_2svref'};
This however implies that any B::* using code must actually operate
on live memory. Tye McQueen thought he remembered a C debugger which
could fully revive a working process given a core dump. My gdb
can't. gdb can allow you to dump the contents of your OP* and
SV* structs. You would most likely just read the dumped structs to
interpret your program's structure. You could, if you wished, use
gdb to dump the structs, then synthetically create B::* objects
which behaved in interface as if they were ordinary and use
B::Deparse on that. At root, our deparser and other debug dumping
tools are mostly object oriented so you could just "fool" them by
creating a pile of fake B::* classes and objects.
You may find reading the B::Deparse class's coderef2text method
instructive. It accepts a function reference, casts it to a B::CV
object, and uses that for input to the deparse_sub method:
require B;
require B::Deparse;
sub your_function { ... }
my $cv = B::svref_2object( \ &your_function );
my $deparser = B::Deparse->new;
print $deparser->deparse_sub( $cv );
For gentler introductions to OP* and related ideas, see the updated
PerlGuts Illustrated and Optree guts.

I doubt there's a tool out there that does this out of the box, so...
Find the source code to the version of perl you were running. This should help you understand the memory layout of the perl interpreter. It will also help you figure out if there's a way to take a shortcut here (e.g. if bytecode is preceded by an easy to find header in memory or something).
Load up the binary + core dump in a debugger, probably gdb
Use the information in the perl source code to guide you in convincing the debugger to spit out the bytecode you're interested in.
Once you have the bytecode, B::Deparse should be able to get you to something more readable.

Well, undump will turn that core dump back into a binary executable (if you can find a working version). You should then be able to load that into perl and -MO=Deparse it.

Related

Autoload'd calls fail via Inline::Perl5 in Raku

I'm rewriting some perl/charting software in Raku but have run into an issue using the ChartDirector perl5 module (link below) via Inline::Perl5. The module is basically a perl interface to a DLL; using the module via Inline::Perl5 seems to work for method calls explicitly included in the code - but most of the method calls are executed via the autoload 'catch all' mechanism in perl5. These do not work in my raku code.
My question is can I expect this sort of application to work using Inline::Perl5? (perhaps there is no way to 'catch' these autoload'd method calls) and, if so, how to make it so.
Thanks for any pointers, suggestions.
wf
ChartDirector software (excellent graphics/charting software - have used it for nearly 2 decades with perl): https://www.advsofteng.com/index.html
Quoting verbiage (simplified), version info, and code from the start of a chartdir forum thread about this:
I'm using Inline::Perl5 which works with every other module I've tried.
I'm hitting problems reproducing the "first project" example on the chartdir site.
using chartdir 6, freebsd 12.2 (intel platform), raku 2022.04, perl 5.32.1.
#!/usr/bin/env raku
use lib:from<Perl5> '/usr/local/lib/perl5/site_perl';
use Inline::Perl5;
my $p5 = Inline::Perl5.new;
$p5.use('perlchartdir') ;
my $data = [85, 156, 179.5, 211, 123];
my $labels = ["Mon", "Tue", "Wed", "Thu", "Fri"];
my $c = $p5.invoke( 'XYChart', 'new', 250, 250);
$c.setPlotArea(30, 20, 200, 200);
$c.addBarLayer($data);
$c.xAxis().setLabels($labels);
$c.xAxis().setLabels($labels);
$c.makeChart("simplebar.png");
All seems fine (i.e., data-dumping $c after line 8 shows a large/reasonable looking structure) until line 9 where I receive "No such method setPlotArea for invocant of type XYChart". Line 10 does appear to work (no complaints) - remaining 3 lines don't (same type of error as for line 8).
And quoting some feedback that was given by Peter Kwan, the primary dev of chartdir:
I have never used Raku before. For your case, the methods that fail seem to be AUTOLOAD methods. ... I suspect may be Raku does not support Perl AUTOLOAD, so it reports undefined methods as not found instead of forwarding it to the "catch all" method. Or may be some additional things need to be imported for it to use the AUTOLOAD.
As dwarring notes in the comments below this SO question, Inline::Perl5 does support autoload (and has for 7 years), so perhaps that aspect is a red herring?
gratefully responding to p6steve's response, I am providing some additional information ..
The full output from various representations of $c (the XYChart object) is included here: https://pastebin.com/gY2ibDaM (hope it's ok to use pastebin for this (still finding my way through stackoverflow)) - the output was 600+ lines long and wasn't sure what I could usefully edit out).
to summarize though ..
dd $c returns nil (although prints out the equivalent of $c.perl (below) to stdout (not sure why))
say $c.perl returns:
XYChart.new(inline-perl5 => Inline::Perl5.new(thread-id => 1), wrapped-perl5-object => NativeCall::Types::Pointer.new(34651088744))
say $c.^methods returns:
(WHERE addHLOCLayer ACCEPTS WHY can new isa rakuseen defined getYCoor addHLOCLayer3 raku yZoneColor Numeric addLineLayer addAreaLayer DESTROY BUILDALL gist perl WHICH sink bless getYValue Str addBarLayer new_shadow_of_p5_object AT-KEY)
finally, say Dump $c (using Data::Dump module) yields about 600 lines of output (included in the pastebin output).
Hi wingfold and welcome to the raku SO tag!
I wonder what you get with dd $c; just before the line $c.setPlotArea(30, 20, 200, 200); - e.g. is $c truly an XYChart object?
If so then what does $c.^methods (the '^' indicates a meta method ... in this case you should get a list of the available methods).
Please do post the results here and hopefully that will help the diagnosis...
Thanks for the info!
Having seen the output of the $c.^methods call, it is clear that $c has no method $c.setPlotArea (reading the error message says the same - perhaps I should have given that due weight!)
I do not know the Inline::Perl5 module well, but I have seen similar issues with Inline::Python.
My experience in Python is that the target language objects only expose their "direct" methods and do not automatically pull in all the composed methods that they can perform.
My workaround has been on the lines of an "eval" style approach, something like:
$p5.run( qq[$c.setPlotArea(30, 20, 200, 200);] );
Hope this helps!

How can I get a better error message if a required attribute is not supplied in Moose?

I'm brand new to Moose. Up until today our environments have been on Perl 5.8.2 which would not support Moose.
I'm working through some examples, and I thought that the "required => 1" setting on an attribute would be handy, however when I try using that option, the error message that is returned is not really usable.
Here's an example:
cat.pl:
#!/usr/bin/perl
{
package Cat;
use Moose;
use Modern::Perl;
has 'name' => (
is => 'ro',
required => 1,
);
sub meow {
my $self = shift;
say 'Meow!';
}
}
use Modern::Perl;
my $alarm = Cat->new();
$alarm->meow();
$alarm->meow();
$alarm->meow();
When I run it:
Attribute (name) is required at /app/perl5/perl-5.10.1/lib/site_perl/5.10.1/aix-thread-multi-64all/Class/MOP/Class.pm line 581
Class::MOP::Class::_construct_instance('Moose::Meta::Class=HASH(0x110ac1a00)', 'HASH(0x110c3b3c0)') called at /app/perl5/perl-5.10.1/lib/site_perl/5.10.1/aix-thread-multi-64all/Class/MOP/Class.pm line 554
Class::MOP::Class::new_object('Moose::Meta::Class=HASH(0x110ac1a00)', 'HASH(0x110c3b3c0)') called at /app/perl5/perl-5.10.1/lib/site_perl/5.10.1/aix-thread-multi-64all/Moose/Meta/Class.pm line 258
Moose::Meta::Class::new_object('Moose::Meta::Class=HASH(0x110ac1a00)', 'HASH(0x110c3b3c0)') called at /app/perl5/perl-5.10.1/lib/site_perl/5.10.1/aix-thread-multi-64all/Moose/Object.pm line 28
Moose::Object::new('Cat') called at cat.pl line 20
If one of our non-perl operators see an error message like that, they will probably freak out. I'm afraid they will not realize that all 5 lines in the error message are actually a part of the same error.
Is there a way to get a nice error message if a required attribute is not supplied?
Something like croak, I can imagine a message like this:
Attribute (name) is required at cat.pl line 20
Again, I'm new to Moose so this may be an easy setting that I am missing.
Thanks in advance!
I think I may have found a solution to my requirement, but I'm not sure if it is the best solution.
And, as #Tanktalus points out, there is value to having a detailed error message.
For the purposes of my question, the MooseX::Constructor::AllErrors extension seems to work:
#!/usr/bin/perl
{
package Cat;
use Moose;
use MooseX::Constructor::AllErrors;
use Modern::Perl;
has 'name' => (
is => 'ro',
required => 1,
);
sub meow {
my $self = shift;
say 'Meow!';
}
}
use Modern::Perl;
my $alarm = Cat->new();
$alarm->meow();
$alarm->meow();
$alarm->meow();
On running, I get:
Attribute (name) is required at cat.pl line 21
Which is what I was thinking.
Since I am not familiar at all with Moose, is this extension okay to use, or will it "muffle" all of the error messages?
I prefer the long error message - because if I'm missing a required parameter, I want my non-perl users to freak out: I obviously missed something in development, unit-test, and system test if it gets all the way to a user with this type of message.
Mind you, I also leave fatal warnings turned on when I go to production. I prefer my failures to be spectacular so that I can't accidentally ignore them.
Since "name" is required, what you need to do is populate this attribute from the constructor. Like this:
my $alarm = Cat->new({'name' => 'Sylvester'});
This should fix your problem.

Perl unit testing deep structures

TMTOWTDI, well I sure hope so - I've been using Test::Deep for my last few projects whenever I come across multidimensional hashes (sometimes 4-5 levels deep). Right now my usual practice is just typing out these hashes and filling in my expected data then running cmp_deeply(actual, expected, msg). Does anyone have any advice on unit testing deep nested data like this? - my current method seems grossly inefficient, sometimes taking over a hour per unit test of a data structure.
I'm using Test::More and is_deeply like the next commenter. I have also somewhat automated comparison by having a t/sample/ directory in which I store the expected values by running the tests by hand and using Data::Dumper where appropriate. (Oh I'd better have gone with YAML -- TMTOWTDI!)
I.e. you run the dump-generator once, review the dumped structures, commit them, and then just rely on is_deeply until tests break which means either a bug, or an intended structure change.
I am usually fine with is_deeply from Test::More for this purpose. I also used Test::Differences for some projects - it seems to have better output on non-equality.
If you just like to test the structure, try Data::Validate::Struct.
You need to define a reference data structure to compare against the actual data structure that can be any level deep, any combination of nested hashes/arrays.
Most important to note here, this Module validates the data type, not the actual data.
Ex:
my $exp = {
'body' => {
'results' => [
{
'template' => [
{
'origin' => 'word',
'name' => 'word',
'id' => 'int',
'attributes' => [
{
'value' => 'number',
'key' => 'word'
},
{
'value' => 'text' ,
'key' => 'word'
}
],
'hostname' => 'hostname',
}
]
}
]
}
};
and then
my $v = new_ok('Data::Validate::Struct', [ $exp ]);
ok($v->validate($act), 'validate an exp against an act' . $v->errstr());
would validate the structure of $act against $exp
Try to have a look at Tests on the Module, i found it very useful.
Perhaps the OP meant the time is consumed by manually typing in the expected values, not by the run time. If you have a known good structure, you can use Data::Dumper or Data::Dump to generate programmatic text that can then be incorporated into your tests.

How can I get Perl's Getopt::Long to tell if arguments are missing?

I'm using Perl's Getopt::Long module to parse command line arguments. However, it seems that it returns a true value even if some of the arguments are missing. Is there a way to tell if this is the case?
In plain old Getopt::Long, you can't do this directly -- as Jonathan said, you need to check your requirements for undef. However, IMHO this is a good thing -- what is a "required" parameter? Often one has parameters that are required in one case and not another -- the most common example here being the sore thumb of the --help option. It's not required, and if the user uses it, he probably doesn't know to or won't pass any of the other "required" parameters.
I use this idiom in some of my code (well, I used to, until I switched to using MooseX::Getopt):
use List:MoreUtils 'all';
Getopt::Long::GetOptions(\%options, #opt_spec);
print usage(), exit if $options{help};
die usage() unless all { defined $options{$_} } #required_options;
Even with MooseX::Getopt I don't set my attributes to required => 1, again because of the --help option. Instead I check for the presence of all attributes I need before moving into the main body of program execution.
package MyApp::Prog;
use Moose;
with 'MooseX::Getopt';
has foo => (
is => 'ro', isa => 'Str',
documentation => 'Provides the foo for the frobnitz',
);
has bar => (
is => 'ro', isa => 'Int',
documentation => 'Quantity of bar furbles to use when creating the frobnitz',
);
# run just after startup; use to verify system, initialize DB etc.
sub setup
{
my $this = shift;
die "Required option foo!\n" unless $this->foo;
die "Required option bar!\n" unless $this->bar;
# ...
}
Options are optional, hence the name 'Getopt'.
You check the option values that are set by Getopt::Long; if one of the crucial ones is 'undef', it was missed and you can identify it.
The return value tells you that there were no horrible blunders in the command line. What constitutes a blunder depends on how you use Getopt::Long, but a classic one would be that the command line contains -o output but the command does not recognize a -o option.

How much do I need to test Moose- and MooseX::FollowPBP-generated methods?

I want to start strictly doing Test-Driven-Development. However, I was wondering how much I should test methods generated by Moose and MooseX::FollowPBP. For example, I have the following class:
package Neu::Series;
use Moose;
use MooseX::FollowPBP;
use File::Find::Wanted;
has 'file_regex' => (
isa=>'RegexpRef',
is=>'rw',
default => sub{
qr{
[A-Z] #Uppercase letter
[a-zA-Z]* #any letter, any number of times
[-] #dash
( #open capturing parenthesis
[0-9]
[0-9]
[0-9]
[0-9]
[a-zA-Z]? #any letter, optional
) #close capturing parenthesis
}xms;
},
);
has 'top_dir' => (
isa=>'Str',
is=>'rw',
);
has 'access' =>(
isa=>'Neu::Access',
is=>'ro',
required=>1,
);
1;
My current test script is:
use strict;
use warnings;
use Test::More tests => 8;
use Neu::Access;
BEGIN{ use_ok('Neu::Series'); }
can_ok( 'Neu::Series', 'new');
can_ok( 'Neu::Series', 'set_file_regex');
can_ok( 'Neu::Series', 'get_file_regex');
can_ok( 'Neu::Series', 'set_top_dir');
can_ok( 'Neu::Series', 'get_top_dir');
can_ok( 'Neu::Series', 'get_access');
my $access = Neu::Access->new(dsn => 'test');
my $series_worker = Neu::Series->new(access => $access);
isa_ok($series_worker, 'Neu::Series');
Is this enough or too-much testing? (That is, besides the obviously missing tests for the regex).
I thought I saw a web page or another post about this somewhere, but I haven't been able to find it today.
There's really no point in testing that the accessors were generated correctly. If they're not, you'll find out very quickly, because any real tests you write will fail.
Moose itself tests that accessors are generated correctly, that Moose-using classes get a constructor, and so on. One of the points of using dependencies is so that you can focus on writing and testing your application, not helper code.
I do agree with daotoad, it's probably worth testing constraints and coercions that you write yourself.
Checking that all accessors were generated correctly is fine... however there are other things you could test at a slight higher level, e.g. why not test that the attributes were generated properly?
use Test::Deep;
my #attrs = Neu::Series->meta->get_all_attributes;
cmp_deeply( [ map { $_->name } #attrs ], superbagof(qw(file_regex top_dir access)));
I'd focus on testing my specification. Did I tell Moose what I wanted it to do correctly?
To this end, I'd start with the following tests:
Verify that read/write attributes have both an accessor and a mutator.
Verify that read only attributes have an accessor and no mutator.
Test any type constraints and coercions. Verify that only acceptable values can be set. If an attribute sIf you expect VII to be seen as a Str and coerced into an Int as 7, test that it does.
Thank you Dave, daotoad, Ether, Elliot, and brian. From reading your answers, comments, and blogs, there seem to be two salient points:
(1) No testing is needed to make sure Moose does what it is supposed to do. I think everyone agrees on this point.
(2) Testing Moose-generated methods is appropriate for establishing, testing, and maintaining your interface. Most agree on this point.
Again, thanks to everyone for your input.
EDIT:
This is just a community-wiki summary. Please read the individual answers and comments.