How can I hide Perl code? - perl

I've written some Perl programs and am planning on distributing them. They're part of a large binary distribution (mostly compiled C/C++). If possible, I'd prefer to give up as little as possible (I'm responsible for delivering working software, not delivering clever algorithms). What is my best bet for hiding the Perl code so that if someone really wants to see the source, they'd have to put in a bit more effort than simply opening the file in an editor?

You could encrypt your code and then, at run time, decrypt it and send it to perl's stdin. (Of course, the decryptor itself would not be encrypted.)
I got some minify/compile answers to my question How can I compile my Perl script so to reduce startup time?
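A minimal sketch of the "decode at run time and feed it to perl's stdin" idea follows. It uses Base64 from core MIME::Base64 as a stand-in, which is only an encoding and not encryption; a real cipher would have to replace it. The helper names `recover_source` and `run_hidden` and the sample payload are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

# Stand-in for the shipped payload. In a real distribution this string
# would be produced ahead of time and the plain source never shipped.
my $payload = encode_base64(q{print "hello from the hidden script\n";});

# Hypothetical helper: recover the source text from the payload.
# Base64 is NOT encryption; substitute a real cipher here.
sub recover_source {
    my ($encoded) = @_;
    return decode_base64($encoded);
}

# Hypothetical helper: pipe the recovered source to a fresh perl on stdin.
# "perl -" tells the interpreter to read the program from standard input.
sub run_hidden {
    my ($encoded) = @_;
    open(my $perl, '|-', $^X, '-') or die "cannot start perl: $!";
    print {$perl} recover_source($encoded);
    close($perl) or die "perl exited with status $?";
    return 1;
}

run_hidden($payload);
```

Because the decoder ships in the clear, anyone can add a `print recover_source($payload)` line and read the source, which is exactly the limitation the answer points out.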

Acme::Bleach

Filter::Crypto (potentially via PAR::Filter::Crypto) is clearly the most advanced open source tool for this job (barring perlcc which doesn't work well for many things, YMMV).
If all you want is to hide the code from casual tinkerers, that's more than sufficient. Hiding it from determined and/or capable people is practically impossible.

It won't make it harder to just open the files but an obfuscator can make it more difficult to understand and modify your code. Have a look here or here for a start.

Related

REST Server in TCL

I would like to add a REST interface to an existing TCL codebase (so that programs in other languages can use the existing TCL code).
I found a list of web servers with TCL support, but I have no idea which one would be a good solution to quickly map our TCL functions to HTTP/REST calls without tons of boilerplate code.
Has anyone here already done something like this and can tell me which of these servers would be a good (or bad/difficult) solution?
Is there maybe another server/framework that is even better for this use case?
Consider Naviserver. Tcl is its embedded interpreter language. It has a low profile memory overhead, and is regularly maintained and tested for performance and low latency.
For what you’re describing, you might consider Wapp. It’ll do exactly the boilerplate elimination you want, and it’s easy to dive into. You’d probably want to use it as a library, rather than an app, given that you’ve got an existing codebase, but its operation past the initial setup is the same for that use case.

Best ways to become familiar with a Perl codebase?

I recently joined a Perl project and I need to start being productive with the codebase fairly quickly. However, I'm finding that I'm getting stuck because I don't know where I need to change or how all the parts of the code fit together.
What are your tips and tools for becoming familiar with a Perl codebase that you have no experience with?
(Note: I realize that there's already a similar question. I'm wondering if there's any Perl-specific strategies.)
First, if the previous maintainers were doing their job well, you should have an extensive test suite and perldoc documentation for each module and script in the codebase. If so, read through the perldoc, and read through the tests. The perldoc should give you an overview of what things do, and the test suite will give you examples of the code being used in context.
Depending on the author, the internal comments may be useful in understanding the intention of the code, so looking through the actual source may provide insights into algorithms, bugs, and intended use as well.
If you don't have any of these, proceed as you would for any badly-maintained codebase: start small, writing programs that try to use the code, and use Test::More and the like to start turning these into a test suite.
In the first case, you may find it to be very simple, in the second, very hard. Peter Scott's Perl Medic can be very useful in assisting you in turning such a codebase into something usable and useful if you're stuck with the second case, and Mike Thomsen's recommendation of Effective Perl Programming is also a good one.
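As a rough illustration of the Test::More approach for an undocumented codebase, the sketch below pins down what a routine does today before anything is changed. The routine `normalize_name` is a hypothetical stand-in defined inline so the example is self-contained; in practice you would load the real module instead.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical stand-in for a routine found in the legacy codebase;
# in practice you would `use` or `require` the real module instead.
sub normalize_name {
    my ($name) = @_;
    $name =~ s/^\s+|\s+$//g;    # trim leading and trailing whitespace
    $name =~ s/\s+/ /g;         # collapse internal runs of whitespace
    return join ' ', map { ucfirst(lc($_)) } split / /, $name;
}

# Characterization tests: record the current behaviour, right or wrong.
is( normalize_name('  larry   WALL '), 'Larry Wall', 'trims, collapses, recases' );
is( normalize_name('perl'),            'Perl',       'single word' );
is( normalize_name(''),                '',           'empty string stays empty' );

done_testing();
```

Saved under `t/` with a `.t` extension, a file like this can be run with `prove`, and each small experiment you write while exploring becomes another test.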
I work on Melody which is written primarily in Perl. It's a rather large code base, and I've found the process of learning the Melody code base is identical to any Java system I've worked on.
It really comes down to just working with it, googling when you see behavior you've never seen before and experimentation.
This book is a great reference for picking up Perl in a serious way. It's not very dense and it will teach you a lot about proper Perl development.
Besides the "similar question", http://perldoc.perl.org/ and an empty test.pl file are a good starting point!
I would like to see a real answer here. The only thing I have is more questions (you don't have to provide answers here, just ask yourself):
Goal
What is the goal of the project, what it is supposed to do?
Who knows the workflow?
Environment
Can you set up the project in a clean test environment?
Does it use a versioning system?
Where are the entry points (i.e. executable files)?
Does it rely on external programs?
Does it require additional system tweaks (i.e. cron scripts)?
Perl code
Is your project using strict and warnings everywhere?
Which CPAN modules are used?
Are there any frameworks used (Moose, Catalyst, probably some ORM, ...)?
Are there any perldocs in the project's modules?
Are there any tests (notably t/*.t)?
I usually start with working on some simple bug report or a simple feature I want to add. While working on code I write comments for code and commit them. Writing tests also helps.

Is this worth doing for practice and learning, without modules (Perl)

I'm looking into connecting to a secure FTP site (using Perl) and downloading all the .log files, saving them in new directories named after the day I downloaded the files. I want to do this without modules, as a learning experience, but before I start I wanted to know whether you guys think it's worth doing, or whether it's way too much for a relatively new programmer and I should just learn the modules?
If it's production work, no, use the modules. Your implementation will be buggy, missing features and unknown to the next person maintaining that code.
Otherwise, yes. It's good to learn the principles of a network protocol. I do have a reservation about FTP as it is a bit baroque, insecure, inefficient and on its way out. scp, HTTP or rsync would be more useful to put your energy into.
I'd start with reading the RFC and putting together your own FTP module using just network sockets. Document and test it as if you were going to release to CPAN as a full learning exercise in making a network module. Run it against some various FTP server implementations as they often interpret the spec differently (or not at all). Don't be afraid to cheat and look at what the existing modules do. Who knows, you might write something better than what's already there.
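One small, network-free piece of such an exercise is parsing the server's replies on the control connection. The sketch below assumes the RFC 959 reply format (a three-digit code followed by a space, or by a hyphen when the reply spans several lines) and assumes the lines have already been read from the socket. The function name `parse_ftp_reply` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: given the lines of one server reply (already read
# from the control socket), return the numeric code and the joined text.
sub parse_ftp_reply {
    my @lines = @_;
    s/\r?\n\z// for @lines;                 # strip line endings

    my $first = shift @lines;
    my ($code, $sep, $text) = $first =~ /\A(\d{3})([ -])(.*)\z/
        or die "malformed reply: $first";
    return ($code, $text) if $sep eq ' ';   # single-line reply

    # Multi-line reply: continues until a line starting with "<code> ".
    my @text = ($text);
    while (defined(my $line = shift @lines)) {
        if ($line =~ /\A$code (.*)\z/) {
            push @text, $1;
            return ($code, join("\n", @text));
        }
        push @text, $line;
    }
    die "unterminated multi-line reply $code";
}

my ($code, $text) = parse_ftp_reply("220-Welcome\r\n", " be nice\r\n", "220 Ready\r\n");
print "$code: $text\n";
```

Keeping the parser separate from the socket code makes it easy to test against captured replies from different server implementations.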
Learning the principles, just like we did at school for long multiplication and division, means we know how things work when we use a shorthand.
However, when you are new to the world, just like when you learn to speak, you start with "A is for Apple" and so on. Nobody explained the finer points of grammar to you; you learnt to express yourself well enough to be understood.
Programming is much the same. In an ideal world you can easily argue that a prewritten generic library is often far less efficient than a specifically targeted set of routines, but if the wheel you are using has already been invented, it seems like a lot of work to make a new one.
So, use the wheels and cogs available, and once you really have the hang of it, then look at inventing your own more efficient ones.
Ad cpan modules:
Modules are a great learning source. There are a zillion modules, and you can learn a lot by studying some of them.
And as you master Perl, you will start writing your own modules. Since your program will use modules anyway (your own), you can ask: why not use modules that are already developed and debugged?
So, learn the Perl basics, study some modules (for example Net::SFTP), and if you still want to write your own solution, it is up to you. :)

Is Perl a good option for heavy text-processing?

I have this web application which needs to do several heavy text processing tasks: removing certain characters, parsing XML files, among others. Some of them involve regular expressions.
The web application has some implementations in Java and others in PHP. Is it worth using Perl or other specific text processing language for such tasks, or is there really no difference with using PHP?
I even thought of using Sed, Awk maybe even some compiled C scripts for processing texts. There's a lot of text to be processed...
Yes, Perl is a good option. As a language, it's definitely more suitable for those kinds of tasks than Java or PHP. If you have the Perl knowledge, I would recommend it for this kind of task.
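For the character-stripping kind of task mentioned in the question, a minimal sketch using only core Perl operators might look like the following. Which characters to remove is an assumption (here, ASCII control characters other than tab, newline, and carriage return), and `clean_text` is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical cleanup routine: the character set to strip is an assumption.
sub clean_text {
    my ($text) = @_;
    $text =~ tr/\x00-\x08\x0B\x0C\x0E-\x1F//d;  # delete control characters
    $text =~ s/[ \t]+/ /g;                        # collapse spaces and tabs
    $text =~ s/^ | $//mg;                         # trim each line
    return $text;
}

print clean_text("  Hello,\x07   world \n\tsecond   line\n");
```

`tr///d` works character by character and is typically the cheaper choice for plain deletion, while `s///` handles anything that needs a pattern.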
I too suggest you use Perl, it's made for text crunching.
However, if you are going to parse/process XML, please don't try to roll your own solution, there are several high quality modules that do the job correctly. As a starter, I recommend you take a look at XML::Twig
Also, for regular expressions, there are dozens of already-made ones under the Regexp::Common distribution. Most probably you'll find what you need there and it will save you time.
Perl is THE language for text processing. It was designed with this in mind.
Text processing is exactly what Perl was created for. After all, it's the Practical Extraction and Report Language. On the other hand, for a web application I'd prefer Python.
Yes, Perl was designed with processing text in mind.
It has tons of useful text processing features, and it was the first language I used (long ago) that had regular expressions.
http://en.wikipedia.org/wiki/Perl
Yes. Text processing is Perl's #1 strong point. Since you will integrate it into your existing app, you'll need to execute an external program, so think about how to run it securely and perhaps as a background process (to avoid startup delays in your real-time web app).

Is there a good obfuscator for Perl code?

Does anyone know of a good code obfuscator for Perl? I'm being asked to look into the option of obfuscating code before releasing it to a client. I know obfuscated code can still be reverse engineered, but that's not our main concern.
Some clients are making small changes to the source code that we give them and it's giving us nightmares when something goes wrong and we have to fix it, or when we release a patch that doesn't work with what they've changed. So the intention is just to make it difficult for them to make their own changes to the code (they're not supposed to be doing that anyway).
I've been down this road before and it's an absolute nightmare when you have to work on "obfuscated" code because it drives up costs tremendously trying to debug a problem on the client's server when you, the developer, can't read the code. You wind up with "deobfuscators", copying the "real code" to the client's server or any of a number of other issues which just become a real hassle to maintain.
I understand where you're coming from, but it sounds like management has a problem and they're looking to you to implement a chosen solution rather than figuring out what the correct solution is.
In this case, it sounds like it's really a licensing or contractual issue. Let 'em have the code open source, but make it a part of the license that any changes they submit have to come back to you and be approved. When you push out patches, check the md5 sums of all code and if it doesn't match what's expected, they're in license violation and will be charged accordingly (and it should be a far, far higher rate). (I remember one company which let us have the code open source, but made it clear that if we changed anything, we've "bought" the code for $25,000 and they were no longer responsible for any bug fixes or upgrades unless we bought a new license).
Don't. Just don't.
Write it into the contract (or revise the contract if you have to), that you are not responsible for changes they make to the software. If they're f-ing up your code and then expecting you to fix it, you have client problems that aren't going to be solved by obfuscating the code. And if you obfuscate it and they encounter an actual problem, good luck in getting them to accurately report line number, etc., in the bug report.
Please don't do that. If you don't want people to alter your Perl code then put it under an appropriate licence and enforce that licence. If people change your code when your licence says that they shouldn't do that, then it's not your problem when your updates no longer work with their installation.
See perlfaq3's answer to "How Can I hide the source for my Perl programs?" for more details.
It would seem your main issue is clients modifying code which then makes it difficult for you to support it. I would suggest you ask for checksums (md5,sha, etc) of their files when they come to you for support, and similarly check files' checksums when patching. For example, you can ask the client to provide the output of a provided program which goes through their install and checksums all the files.
Ultimately they have the code, so they can do whatever they want to it. The best you can do is enforce your licenses and to make sure you only support unmodified code.
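The "provided program which goes through their install and checksums all the files" could be sketched roughly as below, using the core modules File::Find and Digest::SHA. The choice of SHA-256, the manifest layout, and the helper name `checksum_tree` are assumptions.

```perl
use strict;
use warnings;
use File::Find;
use Digest::SHA;

# Hypothetical helper: return { relative_path => sha256_hex } for every
# regular file under $root.
sub checksum_tree {
    my ($root) = @_;
    my %sums;
    find({
        no_chdir => 1,
        wanted   => sub {
            return unless -f $File::Find::name;
            my $path = $File::Find::name;
            (my $rel = $path) =~ s{\A\Q$root\E/?}{};   # path relative to $root
            $sums{$rel} = Digest::SHA->new(256)->addfile($path, 'b')->hexdigest;
        },
    }, $root);
    return \%sums;
}

# When given an install directory, print a manifest the client can send back.
if (my $root = shift @ARGV) {
    my $sums = checksum_tree($root);
    print "$sums->{$_}  $_\n" for sort keys %$sums;
}
```

Comparing the client's manifest against one generated from the tagged release shows which files were modified before any support work starts.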
In this case obfuscating is the wrong approach.
When you release the code to the client you should keep a copy of the code you send them (either on disk or preferably in your version control as a tag/branch).
Then if your client makes changes you can compare the code they have to the code you sent them and easily spot the changes. After all if they feel the need to make changes there is a problem somewhere and you should fix it in the master codebase.
Another alternative for converting your program into a binary is the free PAR-Packer tool on CPAN. There are even filters for code obfuscation, though as others have said, that's possibly more trouble than it's worth.
I agree with the previous suggestions.
However if you really want to, you can look into PAR and/or Filter::Crypto CPAN modules. You can also use them together.
I used the latter (Filter::Crypto) as a really lightweight form of "protection" when we were shipping our product on optical media. It doesn't "protect" you, but it will stop 90% of the people that want to modify your source files.
This isn't a serious suggestion, however take a look at Acme::Buffy.
It will at least brighten your day!
An alternative to obfuscation is converting your script to a binary using something like ActiveState's Perl Dev Kit.
I am running a Windows O/S and use perl2exe from IndigoSTAR. The resulting .EXE file will be unlikely to be changed on-site.
As others have said, "how do I obfuscate it" is the wrong question. "How do I stop the customer from changing the code" is the right one.
The checksum and contract ideas are good for preventing the "problems" you describe, but if the cost to you is the difficulty of rolling-out upgrades and bug-fixes, how are your clients making changes that don't pass the comprehensive test suite? If they are capable of making these changes (or at least, making a change which expresses what they want the code to do), why not simply make it easy/automated for them to open a support ticket and upload the patch? The customer is always right about what the customer wants (they might not have a clue how to do it "the right way", but that's why they are paying you.)
A better reason to want an obfuscator would be for mass-market desktop deployment where you don't have every customer on a standing contract. In that case, something like PAR -- anything which packs the encryption/obfuscation logic into a compiled binary is the way to go.
As several folks have already said: don't.
It's pretty much implicit, given the nature of the Perl interpreter, that anything you do to obfuscate the Perl must be undoable before Perl gets its hands on it, which means you need to leave the de-obfuscation script/binary lying around where the interpreter (and thus your customer) can find it :)
Fix the real problem: checksums and/or a suitably worded license. And support staff trained to say 'you changed it? we're invoking clause 34b of our license, and that'll be $X,000 before we touch it'....
Also, read why-should-i-use-obfuscation for a more general answer.
I would just invite them into my SVN tree on their own branch so they can provide changes and I can see them and integrate their changes into my development tree.
Don't fight it, embrace it.
As Ovid says, it's a contractual, social problem. If they change the code, they invalidate the warranty. Charge them a lot to fix that, but at the same time, give them a channel where they can suggest changes. Also, look at what they want to change and make that part of the configuration if you can. They have something they want to do, and until you satisfy that, they are going to keep trying to get around you.
In Mastering Perl, I talk a bit about defeating obfuscators. Even if you do things like making nonsense variable names and the like, modules such as B::Deparse and B::Deobfuscate, along with Perl tools such as Perl::Tidy, make it pretty easy for the knowledgeable and motivated person to get your source. You don't have to worry about the unknowledgeable and unmotivated so much because they don't know what to do with the code anyway.
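To give a sense of how little effort this takes, here is a small sketch using the core B::Deparse module to regenerate readable source from compiled code. The obscured routine and its variable names are made up for the example; for a whole script, the usual invocation is `perl -MO=Deparse script.pl`.

```perl
use strict;
use warnings;
use B::Deparse;

# A deliberately unreadable routine with meaningless names (hypothetical).
my $x1 = sub { my ($O0, $l1) = @_; return $O0 ? $l1 x $O0 : '' };

# B::Deparse regenerates Perl source from the compiled op tree.
my $deparser = B::Deparse->new('-p');   # -p adds explicit parentheses
my $source   = $deparser->coderef2text($x1);
print "sub $source\n";
```

The exact text printed varies between Perl versions, but the structure of the routine comes back intact, whatever layout tricks were applied to the original file.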
When I talk to managers about this, we go through the normal cost benefit analysis. There is all sorts of stuff you could do, but not much of it costs less than the benefit you get.
Good luck,
Another not serious suggestion is to use Acme::Bleach, it will make your code very clean ;-)