I've found a few related questions, like Python vs. Perl (now deleted) and Is Perl Worth it? (now deleted), but I can't seem to find anything that directly addresses this question.
Is there a legitimate future in Perl? I work in a Perl shop right now, and I came from PHP so I see some of the advantages of an arguably "lower" level language when doing things on the server-level, but it seems to me a lot of the tasks in Perl can be performed more quickly in PHP, and SOME ARGUE (subjective, not my opinion) that Python does these tasks in a more explicit way that's easier to maintain.
Is having this job on my resume ultimately going to make me less employable, especially if the language no longer grows?
A few notes:
I love Perl, so don't think I'm bashing the language. It's fun to use and we use a fairly verbose syntax that is relatively easy to maintain.
I realize that "Vaporware" is a buzzword that isn't necessarily applicable to this situation, because Perl doesn't have a marketing department and they're not "promising" Perl 6 by any date.
I realize that CPAN keeps the community going, so whether Perl 6 comes out or not, people continue to build modules that extend what the language can do. But that doesn't mean that industry shops realize this, and they may switch to "more supported" languages that keep releasing revised versions, like Python and (especially) PHP.
EDIT {CLARIFICATION}
Cade Roux and Telemachus both brought up good points about whether or not your future can be defined by your resume.
To be honest, this was brought up when one of my former employers said "I don't hire anyone with Perl as their last job. That's OLD technology." This was a PHP shop, so take all that with a grain of salt.
Now without defaming my former employer, she's not a tech person AT ALL, so she was really expressing an opinion of a layperson, and in this case my question was more along the lines of "Is there a stigma on this particular technology placed on it by people who don't utilize it?", specifically more along the lines of people who may have had past experience with similar employers. I'm not asking you to look into the future with a magic glass to assume what the next "hot" language would be, but rather if this particular language (which is accused of stunted growth, again by laypeople) has negative connotations placed upon it.
I hope that makes a little more sense.
Plenty of shops - including on Wall Street - heavily use Perl and will continue to do so.
However, I have never seen PHP or Python used in this industry. (I'm not saying they are not used, only that I have never encountered them; this is purely a personal anecdote. Nor have I EVER heard any conversation along the lines of "Perl cannot do X that Python can, let's use Python".)
Perl 6 is irrelevant to the job picture.
Many shops are still on 5.8 or, G-d forbid, 5.6.
More importantly, Perl 5 continues to evolve, including with features and ideas from Perl 6. See Perl 5.10 and 5.11.
Plus, that evolution includes really cool frameworks like Moose.
I can probably come up with more bullets later, but the summary is that no, having a Perl job will in no way negatively affect your career prospects.
However, knowing nothing but Perl may affect it negatively, so make sure you know Java, C#, C++ or something besides dynamic interpreted languages. Not many shops would hire a "Perl only" developer, even if they gladly hire "Perl + other stuff" ones.
See Tim Bunce's Perl Myths slides on SlideShare.
In short, Perl is not dead and has lots of jobs available.
Anyone who actually watches the development of Perl would know that there has perhaps been more work on the Perl language in the past decade than in the previous one.
This has been spurred on by the introduction of Perl 6.
The introduction of Perl 6 also spurred on the now deeply ingrained testing culture.
Just look at how heavily the Rakudo implementation of Perl 6 is tested:
[Image: Rakudo Progress chart, http://rakudo.de/progress.png]
There has also been a lot of back-porting of Perl 6 features into Perl 5.
For example, the Perl 6 "switch" statement:
    #!/usr/bin/perl
    use strict;
    use warnings;

    use 5.10.1;
    # or
    use feature qw'switch say';
    # (given/when has been flagged experimental since Perl 5.18)

    my $str = "testing 123";

    # given sets the current topic, $_
    given( $str ){
        when(/(\d+)/){
            say $1;
        }
        when( [0..10] ){
            say $_, ' is equal to some number between 0 and 10';
        }
    }
There are few languages I would tie my career to. Perl will always be there and it will always be the best tool for certain kinds of jobs. But this is true for many languages. However, there are also languages which have more competition in some of the spaces where they are used. Perl is one language that has a lot more strong niches.
Still, you wouldn't restrict yourself to using just one language for your entire life - or even in one project if there are better options to solve a problem.
Career-wise, there are basic technologies which are fairly universally used, and of these I think a few of the most valuable are: relational database concepts and SQL, XML/HTML/HTTP/DOM, regular expressions. These are all basically independent of any particular vendor or language, and if you are strong in these areas, choice of language and platform are going to be informed by the problem being addressed.
Perl is, and always will be, a practical language for manipulating large amounts of data. I work in an industry where moving, converting, and parsing large amounts of text and image data is what we do, and I couldn't live without Perl.
Likewise, if you're a sysadmin (especially a Unix one), Perl is a necessary tool. There are tons of places where you need to be able to whip up a quick and dirty application that runs right along with the shell functions.
Languages have niches. Perl has a big stable niche, in many ways much more stable than fad-driven web languages. PHP, for example, is a nice little web language, but its saving grace is that it's quick and easy to develop in, not that it is a particularly great language. I'll tend to use PHP over Perl for web applications (though I use Python over PHP, if I have time), but 90% of the stuff I do in my day-to-day would be nearly impossible in PHP, and is flat trivial in Perl.
@Nate: I love Python. LOVE it. I actually worry that I love it too much, and I'm being irrational about it. PHP is a nice tool, but when your main selling point is "Quick and Easy" then you're running a risk. That was the big push behind original Visual Basic, and we all know how that worked out.
I'd discourage you from putting Perl on your resume - there's already too many people in the perl market and we don't want any more! ... just kidding.
The past is supposedly no guide to the future, but, despite having plenty of C (etc.) and Java in my 'skills toolbag' I've seen more gainful employ from my Perl than anything else over the last decade.
I suspect that offshore-perl-new-build may not be the biggest market in the future, but there's certainly active development in the city and media industries in the UK.
Otherwise, I'd just agree with the points above. Technicians with diverse skills are more able to pick the right tools, and less inclined to 'get religious' about language choice.
If you're looking at a post where the non-technical management have a strong point of view about what technology should and shouldn't be used - I'd place that one in the 'avoid' pile.
To add another separate answer - as you have noted - there is a very real danger when dealing with recruiters and others that your resume will be interpreted and things inferred that are not necessarily how you see yourself, and you might get pigeon-holed.
This WILL happen both ways - too much variation and you aren't an expert in anything OR too little variation and you are only good at one thing.
I don't have a simple answer for combatting that, except to ensure that you emphasize portable skills and also achievements which are independent of technology - making the company more money, landing new business, making new markets, etc.
Perl is another tool in your toolbox. If I have an opening and one person is narrowly focused on a specific technology, and another has a broad range of skills, I would be more inclined to hire the one with the wider range of skills even if they might not be quite as deeply knowledgeable. Someone who has a wide range of skills on a range of platforms is someone who can think, innovate and adapt.
I don't understand the point of this question. You have a job and you already know Perl. You can ask whether or not to learn new languages and which ones to learn (please don't, but you could), but none of us can or should predict whether or not you're going to get another job using Perl.
You ask, "Is having this job on my resume ultimately going to make me less employable, especially if the language no longer grows?"
Well, it's better than a blank resume, and you can't change your past, so really what are we talking about here?
Here is an excellent question, and the wonderful tchrist's answer with 7+24+52 pieces of advice and comments on how to make a Perl program utf8-safe.
But there are 19k CPAN modules. What can be done to tell the "good" ones from the "bad" ones (from the utf8 point of view)?
For example, File::Slurp: if you read the file with
#use strict encoding warnings utf8 autodie... etc....
my $str = read_file($file, binmode => ':utf8');
you will get different results depending on the command-line switches, and perl -CSDA will not work. Sad. (Yes, I know that Encode::decode("utf8", read_file($file, binmode => ':raw')); will help, but it is sad anyway.)
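One way to take the command-line switches out of the picture is to read raw bytes and decode them exactly once, by hand. The following is only a sketch using core modules (Encode, File::Temp); the helper name slurp_utf8 is made up for illustration and is not part of File::Slurp. An explicit :raw layer on open should keep the default layers set by -C from being applied.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use File::Temp qw(tempfile);

# Hypothetical helper: read bytes, then decode explicitly.
sub slurp_utf8 {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $bytes = do { local $/; <$fh> };    # slurp the whole file as bytes
    close $fh;
    # FB_CROAK makes malformed UTF-8 die instead of being silently replaced
    return decode('UTF-8', $bytes, Encode::FB_CROAK);
}

# Demo: write known UTF-8 bytes to a temporary file, read back characters.
my ($out, $file) = tempfile(UNLINK => 1);
binmode $out, ':raw';
print {$out} encode('UTF-8', "na\x{EF}ve caf\x{E9}\n");
close $out;

my $text = slurp_utf8($file);
printf "%d characters\n", length $text;
```

The point of the design is that the decoding step is visible in the code, so the result does not change with how the script is invoked.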
My questions:
Is there any preferred way to test or classify which CPAN modules are utf8 safe/ready/correct?
Is there some Test::something already written for utf8 testing?
Is there something like Perl::Critic for utf8 that will check the module source for possible utf8 incorrectness? (Manually checking sources for 7+24+52 things is not what I would call the "easy way of programming".)
Or any other way? :)
I understand that many CPAN modules simply do not need to know about utf8. But there are a zillion others that should.
Please don't misunderstand me. I love the Perl language. I know that Perl has extremely powerful utf8 capabilities (especially 5.14). The above was not meant as a critique of Perl, but I (and probably some others too) need to know which CPAN modules are OK, and how to classify them.
When you develop with several CPAN modules and everything goes well at first, but in final testing you find that some modules do not support utf8 and part of your work is therefore useless, that can cause a bit of disillusionment. :(
Edit:
I understand that all the complications around Unicode have two roots:
Unicode itself - as tchrist excellently analyzed some of the problematic points
Perl - it simply can't break all working modules, live servers, etc., so it needs to maintain backward compatibility.
My only hope: Perl 6. It is a totally new and different language. It doesn't need to maintain any backward compatibility. So I hope that Perl 6 will make default some things that are not possible in Perl 5, and that all utf8 things will be much more intuitive.
But, back to modules: @daxim said: "Authors won't even reveal whether their module is taint-safe, and this feature exists for decades!" - and this is a catastrophe. Maybe (a big maybe, and I honestly have no idea how to do it) we have arrived at the time when much, much harder restrictions need to be placed on CPAN submissions.
On one side, I'm really very happy with the volunteer work of CPAN authors. On the other side, publishing source code is not only a free-speech "right"; it should obey some rules too.
I understand that it is nearly impossible to make any "revolution", but we probably need some "evolution". Maybe flag any CPAN module that is not utf8 safe. Flag all that are not taint safe. Flag (like here on SO) modules that do not meet minimal coding standards, and remove them. Maybe I'm an idealist and/or naive. :)
Chill, the situation is less dire than you're thinking. No one except tchrist operates on this level of Unicode correctness, also see Aristotle's recent commentary. As with all things, you get 80% of the way with 20% of the effort. This base effort, namely getting the topic of character encoding right, is well documented; and jrockway repeats it in his answer in that thread.
Replies to your specific questions:
No, there isn't. There is no concerted effort to collect this information in a central place. The Perl 5 wiki could be used to document problematic modules, Juerd already discusses some in uniadvice. I would really like to see a statement from each module author in their documentation that "this module DTRT w.r.t. encoding", but I don't see it happening. Authors won't even reveal whether their module is taint-safe, and this feature exists for decades!
encoding::warnings can be used to smoke out unintended upgrades. I mention it in the workflow of "Checklist for going the Unicode way with Perl".
You can't do that with Perl::Critic or static analysis. I see no other way than knowledgeable people poking at the module with pointy characters until it falls apart (or not), as mirod just commented.
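Some of that poking can be written down as a repeatable test. This is a rough sketch with core Test::More, not an existing utf8 test module: module_routine is a hypothetical stand-in for whatever function of the module you are examining, and the sample strings are arbitrary picks from different code point ranges.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical stand-in for the routine under test; swap in a real call.
sub module_routine {
    my ($string) = @_;
    return scalar reverse $string;    # character-wise reversal
}

# Character strings (already decoded) covering different code point ranges.
my %samples = (
    'Latin-1 range'   => "caf\x{E9}",
    'above U+00FF'    => "\x{3A9}mega",
    'outside the BMP' => "\x{1F42A} camel",
);

for my $name (sort keys %samples) {
    my $in  = $samples{$name};
    my $out = module_routine($in);
    is length($out), length($in), "$name: character count preserved";
    is scalar reverse($out), $in, "$name: round trip restores the input";
}

done_testing;
```

A routine that treats its input as bytes would typically fail the length check for the second and third samples, which is the kind of breakage the answer describes.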
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
I work as support staff in a biology research institute as a student, and Perl seems to be used everywhere. Not for every single project, but it seems that more than half the people here have a few Perl books in/on their office/desk.
Why is Perl used so much in biology?
Lincoln Stein highlighted some of the saving graces of Perl for bioinformatics in his article:
How Perl Saved the Human Genome Project.
From his analysis:
I think several factors are responsible:
Perl is remarkably good for slicing, dicing, twisting, wringing, smoothing, summarizing and otherwise mangling text. Although the biological sciences do involve a good deal of numeric analysis now, most of the primary data is still text: clone names, annotations, comments, bibliographic references. Even DNA sequences are textlike. Interconverting incompatible data formats is a matter of text mangling combined with some creative guesswork. Perl's powerful regular expression matching and string manipulation operators simplify this job in a way that isn't equalled by any other modern language.
Perl is forgiving. Biological data is often incomplete, fields can be missing, or a field that is expected to be present once occurs several times (because, for example, an experiment was run in duplicate), or the data was entered by hand and doesn't quite fit the expected format. Perl doesn't particularly mind if a value is empty or contains odd characters. Regular expressions can be written to pick up and correct a variety of common errors in data entry. Of course this flexibility can also be a curse. I talk more about the problems with Perl below.
Perl is component-oriented. Perl encourages people to write their software in small modules, either using Perl library modules or with the classic Unix tool-oriented approach. External programs can easily be incorporated into a Perl script using a pipe, system call or socket. The dynamic loader introduced with Perl5 allows people to extend the Perl language with C routines or to make entire compiled libraries available for the Perl interpreter. An effort is currently under way to gather all the world's collected wisdom about biological data into a set of modules called "bioPerl" (discussed at length in an article to be published later in the Perl Journal).
Perl is easy to write and fast to develop in. The interpreter doesn't require you to declare all your function prototypes and data types in advance, new variables spring into existence as needed, calls to undefined functions only cause an error when the function is needed. The debugger works well with Emacs and allows a comfortable interactive style of development.
Perl is a good prototyping language. Because Perl is quick and dirty, it often makes sense to prototype new algorithms in Perl before moving them to a fast compiled language.
Sometimes it turns out that Perl is fast enough so that the algorithm doesn't have to be ported; more frequently one can write a small core of the algorithm in C, compile it as a dynamically loaded module or external executable, and leave the rest of the application in Perl (for an example of a complex genome mapping application implemented in this way, see http://waldo.wi.mit.edu/ftp/distribution/software/rhmapper/).
Perl is a good language for Web CGI scripting, and is growing in importance as more labs turn to the Web for publishing their data.
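To make the "forgiving" point in the list above concrete, here is a small sketch. The record format and field names are invented for illustration: hand-entered lines with inconsistent spacing and case, one missing field, and one field entered twice.

```perl
use strict;
use warnings;

# Made-up hand-entered records.
my @lines = (
    'clone = AB123 ; length= 1500',
    'CLONE=ab124;',
    'clone=AB125; length=900; length = 910',
);

my @records;
for my $line (@lines) {
    my %rec;
    # Pick out key=value pairs, tolerating stray whitespace and mixed case.
    while ($line =~ /(\w+)\s*=\s*(\w+)/g) {
        push @{ $rec{lc $1} }, uc $2;    # keep every value, even duplicates
    }
    push @records, \%rec;
}

for my $rec (@records) {
    my $length = $rec->{length} ? join('/', @{ $rec->{length} }) : 'n/a';
    print "$rec->{clone}[0]\t$length\n";
}
```

Missing fields simply never appear in the hash and repeated fields accumulate in a list, so nothing has to be declared up front. That is also the "curse" side: nothing complains about a misspelled field name either.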
The real answer probably has less to do with Perl than you think. Many of the things that happen are accidents of history. At the time, way back when, Perl was pretty popular, Java was getting more popular, not too many people were paying attention to Python, and Ruby was just getting started.
The people who needed to get work done used Perl and made some libraries in Perl, and other people started using those libraries. Once people start using something that is moderately useful to them, they tend not to switch (economists call those "switching costs"). From there, even more people start using it because a lot of other people are using it.
The same evolution might not happen today. I'd say that Perl, Python, and Ruby are all completely adequate and up to the task. All the things that mobrule quotes from Lincoln Stein could apply to any of the three today. If everyone had to start from scratch today, any one of those languages could be the one that everyone uses.
I've noticed, from my own client base though (a very small and unrepresentative sample of biotech), that the people pushing the programming for a lot of the biological stuff seemed to be at least part-time sysadmins who were supporting scientists. The scientists worried about the science and did some light programming, but the IT support people were doing a lot of the heavy lifting for the non-science parts. Perl is very well positioned as a sysadmin tool since it's the duct-tape of the internet.
Probably because Perl is good at manipulating strings, and much research in genetics involves the manipulation of veeery long "ACTGCATG..." strings. Just guessing...
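As a small illustration of that guess (the sequence is just the fragment quoted above, and this is a sketch rather than how any particular lab does it), two everyday sequence operations each come down to one built-in operator:

```perl
use strict;
use warnings;

my $dna = 'ACTGCATG';

# Reverse complement: reverse the string, then swap A<->T and C<->G.
(my $revcomp = reverse $dna) =~ tr/ACGT/TGCA/;

# tr/// returns the number of characters it matched, so this counts G and C
# without changing the string.
my $gc = ($dna =~ tr/GC//);

print "$revcomp\n";
printf "GC content: %.1f%%\n", 100 * $gc / length $dna;
```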
I use lots of Perl for dealing with qualitative and quantitative data in social science research. In terms of getting things done (largely with text) quickly, finding libraries on CPAN (nice central location), and generally just getting things done quickly, it can't be surpassed.
Perl is also excellent glue, so if you have some instrumental records, and you need to glue them to data analysis routines, then Perl is your language.
Perl seems to be the language of choice for bioinformatics - there's even an O'Reilly title on just this subject: Beginning Perl for Bioinformatics.
Perl is very powerful when it comes to dealing with text, and it's present in almost every Linux/Unix distribution. In bioinformatics, not only are sequence data very easy to manipulate with Perl, but most bioinformatics algorithms also output some kind of text results.
Then, the biggest bioinformatics centers like the EBI had that great guy, Ewan Birney, who was leading the BioPerl project. That library has lots of parsers for every kind of popular bioinformatics algorithms' results, and for manipulating the different sequence formats used in major sequence databases.
Nowadays, however, Perl is not the only language used by bioinformaticians: along with sequence data, labs produce more and more different kinds of data types and other languages are more often used in those areas.
The R statistics programming language for example, is widely used for statistical analysis of microarray and qPCR data (among others). Again, why are we using it so much? Because it has great libraries for that kind of data (see bioconductor project).
Now when it comes to web development, CGI is not really state of the art today, but people who know Perl may stick to it. In my company though it is no longer used...
I hope this helps.
Perl basically forces very short development cycles. That's the kind of development that gets stuff done.
It's enough to outweigh Perl's disadvantages.
Bioinformatics deals primarily in text parsing, and Perl is the best programming language for the job as it is made for string parsing. As the O'Reilly book (Beginning Perl for Bioinformatics) says, "With [Perl's] highly developed capacity to detect patterns in data, Perl has become one of the most popular languages for biological data analysis."
This seems to be a pretty comprehensive response. Perhaps one thing missing, however, is that most biologists (until recently, perhaps) don't have much programming experience at all. The learning curve for Perl is much lower than for compiled languages (like C or Java), and yet Perl still provides a ton of features when it comes to text processing. So what if it takes longer to run? Biologists can definitely handle that. Lab experiments routinely take one hour or more to finish, so waiting a few extra minutes for that data processing to finish isn't going to kill them!
Just note that I am talking here about biologists that program out of necessity. I understand that there are some very skilled programmers and computer scientists out there that use Perl as well, and these comments may not apply to them.
People missed out DBI, the Perl abstract database interface that makes it really easy to work with bioinformatic databases.
There is also the one-liner angle. You can write something to reformat data in a single line in Perl and just use the -pe flag to embed that at the command line. Many people using AWK and sed moved to Perl. Even in full programs, file I/O is incredibly easy and quick to write, and text transformation is expressive at a high level compared to any engineering language around. People who use Java or even Python for one-off text transformation are just too lazy to learn another language. Java especially has a high dependence on the JVM implementation and its I/O performance.
At least you know how fast or slow Perl will be everywhere, slightly slower than C I/O. Don't learn grep, cut, sed, or AWK; just learn Perl as your command line tool, even if you don't produce large programs with it. Regarding CGI, Perl has plenty of better web frameworks such as Catalyst and Mojolicious, but the mindshare definitely came from CGI and bioinformatics being one of the earliest heavy users of the Internet.
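For readers who have not seen the one-liner style, here is a sketch of what the -p switch amounts to. The file name data.tsv is a placeholder, and the loop below is a simplified rendering of the wrapper that -p puts around the -e code, fed from an in-memory filehandle so that it runs on its own.

```perl
use strict;
use warnings;

# Command-line form (data.tsv is a placeholder file name):
#   perl -pe 's/\t/,/g' data.tsv
#
# Roughly the same thing written out as a script.
my $input = "id\tname\n1\tBRCA1\n2\tTP53\n";
open my $in, '<', \$input or die "Cannot open in-memory file: $!";

my $output = '';
while (my $line = <$in>) {
    $line =~ s/\t/,/g;      # the -e expression
    $output .= $line;       # -p prints each line after the expression runs
}
close $in;

print $output;
```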
Perl is very easy to learn compared to other languages. It can fully exploit biological data, which is becoming big data. It can manipulate big data and performs well for data manipulation, data curation, and all types of DNA programming. Automation of biology has become easy thanks to languages like Perl, Python, and Ruby. It is very easy for those who know biology but do not know how to program in other languages.
Personally, and I know this will date me, but it's because I learned Perl first. I was being asked to take FASTA files and mix with other FASTA files. Perl was the recommended tool when I asked around.
At the time I'd been through a few computer science classes, but I didn't really know programming all that well.
Perl proved fairly easy to learn. Once I'd gotten regular expressions into my head I was parsing and making new FASTA files within a day.
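For context, the kind of FASTA handling described here can be sketched in a couple of dozen lines. The sample data is made up, and a real script would read from a file (or use BioPerl) rather than a string.

```perl
use strict;
use warnings;

# Made-up FASTA text; real input would come from a file.
my $fasta = <<'END';
>seq1 first example
ACTGAC
TTGA
>seq2
GGCC
END

my (%seq, @order, $id);
for my $line (split /\n/, $fasta) {
    if ($line =~ /^>(\S+)/) {       # header line: the ID is the first word
        $id = $1;
        push @order, $id;
        $seq{$id} = '';
    }
    elsif (defined $id) {
        $line =~ s/\s+//g;          # drop stray whitespace
        $seq{$id} .= $line;         # a sequence may span several lines
    }
}

printf "%s\t%d\n", $_, length $seq{$_} for @order;
```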
As has been suggested, I was not a programmer. I was a biochemistry graduate working in a lab, and I'd made the mistake of setting up a Linux server where everyone could see me. This was back in the day when that was an all-day project.
Anyway, Perl became my go-to for anything I needed to do around the lab. It was awesome, easy to use, and super flexible, and the other Perl guys in other labs were a lot like me.
So, to cut it short, Perl is easy to learn, flexible and forgiving, and it did what I needed.
Once I really got into bioinformatics I picked up R, Python, and even Java. Perl is not that great at helping to create maintainable code, mostly because it is so flexible. Now I just use the language for the job, but Perl is still one of my favorite languages, like a first kiss or something.
To reiterate, most bioinformatics folks learned coding by just kluging stuff together, and most of the time you're just trying to get an answer for the principal investigator (PI), so you can't spend days on code design. Perl is superb at just getting an answer, it probably won't work a second time, and you will not understand anything in your own code if you see it six months later; BUT if you need something now, then it is a good choice even though I mostly use Python now.
I hope that gives you an answer from someone who lived it.
A couple of years back I participated in writing the best practices/coding style for our (fairly large and often Perl-using) company. It was done by a committee of "senior" Perl developers.
As anything done by consensus, it had parts which everyone disagreed with. Duh.
The part that rubbed wrong the most was a strong recommendation to NOT use many Perlisms (loosely defined as code idioms not present in, say C++ or Java), such as "Avoid using '... unless X;' constructs".
The main rationale posited for such rules as this one was that non-Perl developers would have a much harder time with the Perl code base otherwise. The assumption here, I guess, is that Perl code jockeys are a rarer breed overall - and among new hires to the company - than non-Perlers.
I was wondering whether SO has any good arguments to support or reject this logic... it is mostly academic curiosity at this point as the company's Perl coding standard is ossified and will never be revised again as far as I'm aware.
P.S. Just to be clear, the question is in the context I noted - the answer for an all-Perl smaller development shop is obviously a resounding "use Perl to its maximum capability".
I write code assuming that a competent Perl programmer will be reading it. I don't go out of my way to be clever, but I don't dumb it down either.
If you're writing code for people who don't know the language, you're going to miss most of the point of using that language. I often find that people want to outlaw Perlisms because they refuse to learn any more than they already know.
Since you say that you are in a small Perl shop, it should be pretty easy to ask the person who wrote the code what it means if you don't understand it. That sort of stuff should come up in code reviews and so on. Everyone continues to learn more about the language as you have periodic and regular chances to review the code. You shouldn't let too much time elapse without other eyeballs looking at someone's code. You certainly shouldn't wait until a week after they leave the company.
As for new hires, I'm always puzzled why anyone would think that you should sit them in front of a keyboard and turn them loose expecting productive work in a codebase they have never seen.
This isn't limited to Perl, either. It's a general programming issue. You should always be learning more about your tools. Most of the big shops I know have mini-bootcamps to bring developers up to speed on the codebase, including any bits of tricky code they may encounter.
I ask myself two simple questions:
Am I doing this because it's devilishly clever and/or shows off my extensive knowledge of Perl arcana?
Then it's a bad idea. But,
Am I doing this because it's idiomatic Perl and benefits from Perl's distinct advantages?
Then it's a good idea.
I see no justifiable reason to reject, say, string interpolation just because Java and C don't have it. unless is a funny one but I think having a subroutine start with the occasional
return undef unless <something>;
isn't so bad.
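A sketch of both points together, with a made-up greeting routine: the guard clause sits on the first line where it is hard to miss, and interpolation replaces a chain of concatenations.

```perl
use strict;
use warnings;

# Hypothetical routine: guard clause first, real work after.
sub greeting {
    my ($name) = @_;
    # Many style guides prefer a bare "return" here, because "return undef"
    # yields a one-element list when the sub is called in list context.
    return undef unless defined $name && length $name;
    return "Hello, $name!";     # interpolation instead of concatenation
}

my $ok    = greeting('Larry');
my $empty = greeting('');

print "$ok\n";
print defined $empty ? "got a greeting\n" : "undef for empty input\n";
```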
What sort of perlisms do you mean?
Good:
idiomatic for loops: for(1..5) {} or for( @foo ) {}
Scalar context evaluation of arrays: my $count = @items;
map, grep and sort: my %foo = map { $_->id => $_ } @objects;
OK if limited:
statement modifier control - trailing if, unless, etc.
Restrict to error trapping and early returns. die "Bad juju\n" unless $foo eq 'good juju';
As Schwern pointed out, another good use is conditional assignment of default values: my $foo = shift; $foo = 'blarg' unless defined $foo;. This usage is, IMO, cleaner than a my $foo = defined $_[0] ? shift : 'blarg';.
Reason to avoid: if you need to add additional behaviors to the check or an else, you have a big reformatting job. IMO, the hassle to redo a statement (even in a good editor) is more disruptive than typing several "unnecessary" blocks.
Prototypes - use only to create filtery functions like map. Prototypes are compiler hints not 'prototypes' in the sense of any other language.
Logical operators - standardize on when to use and and or vs. && and ||. All your code should be consistent. Best if you use a Perl::Critic policy to enforce.
Avoid:
Local variables. Dynamic scope is damn weird, and local is not the same as local anywhere else.
Package variables. Enables bad practices. If you think you need globally shared state, refactor. If you still need globally shared state, use a singleton.
Symbol table hackery
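The idioms in the list above, gathered into one runnable sketch. The data and sub names are invented for illustration, and hash references stand in for the objects in the original one-liners. The last part shows why local surprises people: it is dynamic scope, so the temporary value is visible inside called subroutines.

```perl
use strict;
use warnings;

my @items = (
    { id => 'b2', score => 7 },
    { id => 'a1', score => 3 },
    { id => 'c3', score => 9 },
);

my $count   = @items;                                        # scalar context: number of elements
my %by_id   = map  { $_->{id} => $_ } @items;                # index the records by id
my @high    = grep { $_->{score} > 5 } @items;               # keep records that pass a test
my @ordered = sort { $a->{score} <=> $b->{score} } @items;   # ascending by score

# Trailing "unless" for a default value.
sub label {
    my $text = shift;
    $text = 'blarg' unless defined $text;
    return $text;
}

# Dynamic scope: local gives a package variable a temporary value that
# called subroutines also see, and restores the old value on scope exit.
our $depth = 0;
sub report { return "depth=$depth" }
sub nested {
    local $depth = $depth + 1;
    return report();
}
my $inside  = nested();
my $outside = report();

print "$count items, ", scalar(@high), " above 5, lowest is $ordered[0]{id}\n";
print label(), " / $inside inside, $outside outside\n";
```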
It must have been, as you say, a few years ago, because Damian Conway has 'cornered the market' in Perl standards with Perl Best Practices for the last few years.
I've worked in a similarly ossified environment - where we were not allowed to adopt the latest best practices, because that would be a change, and no one at a sufficiently high level in the corporate structure understood (or could be bothered to understand) Perl and sign off on moving in to the 21st Century.
A corporation that deploys a technology and retains it, but doesn't either buy in expertise or train up in house, is asking for trouble.
(I'd guess you're working in a highly change-controlled environment - financial perhaps?)
I agree with brian on this by the way.
I'd say Moose kills off 99.9% of Perl-isms, by convention, that shouldn't be used: symbol table hackery, reblessing objects, common blackbox violations: treating objects as arrays or hashes. The great thing, is it does all of this without taking the functionality hit of "not using it".
If the "perl-isms" you're really referring to are the statement-modifier form (warn "bad idea" unless $good_idea), unless, and until, then I don't think you really have much of an argument, because these "perlisms" don't seem to inhibit readability for either Perl users or non-Perl users.
Pick up a copy of Effective Perl Programming: Ways to Write Better, More Idiomatic Perl (2nd Edition), and treat that as a guideline. It contains many of the better idioms and is packed with the little bits of information that will get you writing good Perl style Perl code, as opposed to C or Java (or whatever) style Perl code.
I've heard that Perl is the go-to language for string manipulation (and line noise ;). Can someone provide examples and comparisons with other language(s) to show me why?
It is very subjective, so I wouldn't say that Perl is the best choice, but it is certainly a valid choice for string manipulation. Other alternatives are Tcl, Python, AWK, etc.
I like Perl's capabilities because it has excellent support (better than POSIX, as pointed out in the comment) for fast regexes, and the implicit variables make it easy to do basic string crunching with very little code.
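A short sketch of what "very little code" looks like in practice. The log lines and the sentence are invented sample data.

```perl
use strict;
use warnings;

my @log = (
    'ERROR disk full on /dev/sda1',
    'INFO  backup finished',
    'ERROR timeout talking to db01',
);

# With no explicit target, the match operator works on $_, which the
# loop aliases to each element in turn; captures land in $1.
my @errors;
for (@log) {
    next unless /^ERROR\s+(.+)/;
    push @errors, $1;
}

# Word frequencies in one statement, again leaning on $_.
my %count;
$count{lc $_}++ for split ' ', 'The quick fox saw the lazy dog near the barn';

print "$_\n" for @errors;
print "the=$count{the} dog=$count{dog}\n";
```

The same implicit variables are what make badly written Perl hard to follow, so this style is best kept to short, local stretches of code.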
If you have a *nix background a lot of what you already know will apply to Perl as well, which makes it fairly easy to pick up for a lot of people.
Perl -> Practical Extraction and Report Language
Perl's strength (when it comes to string processing) lies in its very powerful regular expression engine.
Because of this, many people in the field of bioinformatics use Perl as their
main tool, hence the large number of posts about BioPerl on PerlMonks. In bioinformatics they work with strings a lot; they call them "sequences" (I don't know much about this).
Perlmonks.org is the heart of the Perl community. Check out the immense number of hits
when you search for site:perlmonks.org regex: 20,000 hits.
You cannot ignore the sheer number of modules on CPAN:
375 modules under the String namespace on CPAN (Perl's module repository)
241 in Regex namespace
156 in Regexp namespace.
This is very clear evidence that Perl is a very powerful language when it comes to string processing.
So if you want to do some string processing and you're using Perl, you've got it covered :)
To address the second part of your question: Perl's reputation for line noise comes from 4 kinds of people:
Overly clever (for their own good) hackers (or sometimes just hacks) who value cleverness and showing off over readability. "If it was hard to write it should be hard to read" is NOT just a mythical attitude.
People who wouldn't know good software development if it hit them over the head with a cluebat. Such as people who save a couple of characters in a program by using $_ instead of a named variable. In a nested scope. Or never heard of comments. Or self-documenting identifiers. Or whitespace.
People who think that software development == code golf. More seriously, people who think that the fewer characters the code has, the more readable it is, because they misunderstand what "conciseness" means in code.
(NOTE: first 2 sets are not mutually exclusive)
People who code/hack in Perl (e.g. sysadmins) who have very little training, experience, or incentive to do software development. That is, the percentage of people using Perl who do quick and dirty hacks with bad style and worse code quality is probably higher than for, say, Python.
Just for reference, 80% of the awful Perl "code" at my $work falls under this: it was written by financial analysts who are smart enough to pick up a Perl book and some earlier scripts and clone a script that does what the business needs, but who don't have the CS/programming background to worry about how readable or maintainable their code is.
In other (and less snide) words, you can write beautiful, incredibly readable and easy to maintain software in Perl. It all depends on who does the writing, what their priorities and skills are. Also, just like with any other language, you can write a miserable write-only mess with it.
The difference from other languages is that very often the write-onlyness of said mess, when done in Perl, does indeed consist of a very high density of non-letter characters (sigils, and special characters in poorly written regexes). This high density can indeed asymptotically approach line noise.
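To make the contrast concrete, here is a sketch of the same small task written both ways: parsing "name = value" lines into a hash. Both subs and their names are hypothetical, written only for this comparison.

```perl
use strict;
use warnings;

# Terse style: implicit $_ throughout and no names for anything.
sub parse_terse {
    my %h; /^\s*(\w+)\s*=\s*(.*?)\s*$/ and $h{$1} = $2 for @_; return \%h;
}

# Same behavior with named variables, whitespace, and a commented regex.
sub parse_readable {
    my @config_lines = @_;
    my %setting_for;

    for my $line (@config_lines) {
        next unless $line =~ m{
            ^ \s* (\w+) \s*     # setting name
            = \s* (.*?) \s* $   # value, with surrounding spaces trimmed
        }x;
        my ($name, $value) = ($1, $2);
        $setting_for{$name} = $value;
    }

    return \%setting_for;
}
```

The two subs should return the same hash for the same input; only the second one tells the reader what it is doing. The /x flag is what allows the whitespace and comments inside the regex.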
Because that is what Perl was made for. Because Perl is expressive, powerful, and fast. Many times I have beaten specialized products with a small, dirty Perl script written in a few minutes. For example: an outer join and a large join vs. MySQL (just because it can't do a merge join), and ETL processing vs. Java Hadoop (because I have years of experience writing it effectively and Perl's IO layer is just great), and so on.
It's a very subjective question. Perhaps the true answer is that Perl has a nice syntax (including the regex syntax) that makes people want to sing its praises over other languages? IMHO, any language that supports a rich regex syntax would be considerably powerful at string manipulation.
Kids these days! Back in the day, all we had was SNOBOL -- and we liked it! Try it sometime...you never know, you might want something respectable to fall back on when this Perl fad runs its course!
Perl is widely used for string manipulation tasks because its string manipulation API is easy to learn, and its regex support is widely used as well. It has been in use for a very long time, and anyone with a Unix background would pick up Perl very easily. Historically, Perl was developed in the late '80s for report processing and text processing tasks. The trend continues to this day: anyone with a string manipulation or text processing task tends to reach for Perl first. It's not that other languages like Python aren't up to the task, but Perl is popular in this area.
I like Perl a lot, write books about it, publish a magazine about it, and so on. I don't think I would ever say it's the best language to do anything in. A lot of that has to do with the task you need to do. For many string processing tasks, ETL, data cleanup, and so on, Perl is a very strong and capable language. You wouldn't have that much trouble doing simple tasks.
Your comment sounds like it comes from the early 1990s though, when the rest of the world hadn't caught up. Many of the dynamic languages are now up to task, so you might not have to switch languages. If you decide to use Perl and run into problems, there are plenty of people here who are willing to help, and not all of us will fault you if you choose something else. :)
At the beginning, Perl was developed for easy report processing and dealing with text files, so it has very strong regex support. You can find most of the information on regexes in perldoc.
Perl was the go-to language for a long time. The problem is that it can be pretty messy and difficult to maintain (some people can write Perl that avoids this, but it is very easy to write ugly code). I would not tell you to avoid Perl, but many have moved on to more modern alternatives.
I would recommend learning one of the newer scripting languages such as Python or Ruby. Both will work very well for your needs, and can easily handle more difficult tasks later on. They're both quite nice to work in, after having written C and Perl for so long.
In short, Perl would be a good hammer for this nail. Python and Ruby would be nail-guns.
I disagree that Perl is the best language for text processing. Simple things are easy; to replace foo with bar:
$data =~ s/foo/bar/g;
Harder things are not simple, though. Look at Data::SExpression, for example. It is a lot of code to do something very simple.
A similar implementation in Haskell with PArrow looks something like:
import Text.ParserCombinators.PArrow

data Atom = QuotedString String | Symbol String
    deriving (Show, Eq)

data Sexp = Sexp [Sexp] | Atom Atom
    deriving (Eq)

quotedString :: Char -> Char -> MD a Atom
quotedString quoteChar escapeChar = between q q inside >>^ QuotedString
    where q = char quoteChar
          inside = many $ (char escapeChar >>> anyChar) <+> notChar quoteChar

doubleQuotedString, symbol :: MD a Atom
doubleQuotedString = quotedString '"' '\\'
symbol = word >>^ Symbol

atom, sexp :: MD a Sexp
atom = (doubleQuotedString <+> symbol) >>^ Atom
sexp = atom <+> (between (char '(') (char ')') sexp' >>^ Sexp)
    where sexp' = sepBy1 sexp spaces
Just sayin'. Perl is not the end-all-and-be-all of text manipulation. There are many reasons to prefer Perl to other languages, but parsing is not one of them.
It is well known that different people have different aptitudes regarding various programming paradigms (e.g. some people have trouble learning non-procedural, especially functional languages. Some people have trouble understanding pointers - see Joel Spolsky's blog for musings on that. Some people have trouble grasping recursion).
I was recently reading about a study that looked at how the grammar of someone's native language affected their speed of learning math. Can't find that article now but a quick googling found this reference.
That led me to wondering whether someone's native culture or first language might affect their aptitude towards various programming paradigms. I'm more curious about positive influences - e.g. some trait that make it easier/faster for someone to learn a particular paradigm, for example native language grammar being very recursion-oriented.
To be clear, I'm looking for how culture or language grammar may affect the difference in aptitude of the same person towards various paradigms, as opposed to how it affects overall aptitude towards programming between different persons.
Important: the only answers I'm interested in are either references to scientific studies, or personal observations from someone intimately familiar with a particular culture/language, including from their own experience.
E.g. I'm not interested in your opinion of how Chinese being your first language affects anything unless you speak Chinese or have worked extensively with an extremely large set of Chinese-native programmers.
I'm OK with your guesstimates not based on scientific studies, but please be sure to supply your reasoning about plausible causes of your observation.
I'm not interested in culture-bashing (any such comments will be deleted or flagged for deletion).
I'm also not particularly interested in culture-building - we all know Linus is from Finland and Tetris was written in Russia and Larry Wall is an American. Any culture/nation can produce a brilliant mind in any discipline. I'm interested in averages.
Disclaimer: I was a Cultural Anthropologist before I got into programming, so you know I'm going to be on a high horse, here.
Obviously, a person's history will have an impact on their aptitude for any particular task, but I think this has less to do with the structure or grammar of a person's language than it does with the particular material conditions of the culture in which that language is spoken.
For example, a pair of Anthropologists in the 60's went to various African communities and tested people's susceptibility to various optical illusions. Here is a classic one:
In this illusion, the bottom line looks longer, because the angled lines connecting it make it appear to be off in the distance.
These Anthropologists found that in many African cultures, the illusion doesn't work at all - people consider the lines to be the same length. By refining their study, they found that the only people who were susceptible to the illusion were people who had grown up in an urban environment. They hypothesized that the illusion did not work on people from remote jungle environments, because these people had little or no experience with right angles and seeing things at very long distances.
My point with this is that even if you successfully found a correlation between programmers' native languages and their abilities with certain aspects of programming, you couldn't be sure that the correlation wasn't spurious. For example, you might think that Asians tend to be bad drivers, and you might even be able to demonstrate this statistically. If you then concluded, however, that "bad driving" is some sort of fundamental characteristic of Asian-ness, you would be ignoring the fact that Asians are more likely to be from Asia, and thus to have had much less experience driving cars (or even being in cars) while growing up than Westerners (and especially Americans) have had.
With programming, we might think that a particular language inhibits programming ability, and not take note of the fact that the society in which that language is spoken has much less access to computers, and thus people growing up with that language appear to have less programming aptitude or ability to understand certain programming concepts.
In short, I wouldn't give much credence to the idea that language inhibits anyone's ability to understand anything in particular. The human mind is much too flexible and adaptable for that to be true.
This seems analogous to the Sapir-Whorf Hypothesis: that the facilities of a language affect the ease with which one can cogitate about certain subjects, or in the words of the Wikipedia article:
"The linguistic relativity principle (also known as the Sapir-Whorf Hypothesis) is the idea that the varying cultural concepts and categories inherent in different languages affect the cognitive classification of the experienced world in such a way that speakers of different languages think and behave differently because of it."
( http://en.wikipedia.org/wiki/Linguistic_relativity )
While there appears to be little definitive information here, the discussions appear to be relevant to the question, and perhaps worthy of further exploration.
Just a few random thoughts. I think the influence is generally very weak and can be neglected most of the time, but such influences do exist and sometimes make themselves felt.
In Chinese grammar, for example, we don't quite distinguish between plural and singular forms, but I wouldn't think we Chinese have any noticeable difficulty understanding the concepts of scalar and array in Perl. The reason might be this: although we generally don't need particular suffixes or changes in form to indicate whether something is singular or plural, we do have the concepts of plural and singular, and we mostly depend upon the context to tell them apart. Grammar-wise, context in Chinese may well be far more important than it is in the languages of the Indo-European family. We omit a lot of things, sometimes because they have already been mentioned and sometimes because we just presume that they can be implicitly understood by the listener. In either case, we don't need indefinite and definite articles (a, an, the) or relative pronouns like that, which, and who to indicate whether something is being mentioned for the first time or yet again. Maybe that's partially why I feel very comfortable with Perl's default variable "$_". print; chomp; split; all act upon $_, which has never been mentioned explicitly. But this is quite subjective.
I think the Chinese language is more characterized by implicitness and fuzziness than Indo-European languages. For example, we never pay attention to subject-verb agreement and we never conjugate verbs to denote tenses. This could mean that the Chinese are inclined to use a not quite so logical mode of thinking. One of my teachers once used an example to try to generalize (or maybe over-generalize) the difference between the Chinese non-logical mode of thinking and the American logical mode of thinking.
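For readers who don't know Perl, a minimal sketch of that "never mentioned" variable follows. The sample data and the names @input and @words are made up for illustration.

```perl
use strict;
use warnings;

my @input = ("alpha beta\n", "gamma delta\n");   # hypothetical sample data
my @words;

for (@input) {             # no loop variable named, so each item lands in $_
    chomp;                 # same as chomp($_)
    push @words, split;    # same as split(' ', $_)
    print;                 # same as print($_)
    print "\n";
}

print scalar(@words), " words\n";
```

None of chomp, split, or print is told what to work on; each falls back on whatever the context has already put in $_.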
If the American version of quarrelling should be this:
“I can lick you.”
“No, you can’t.”
“Yes, I can.”
“No, you can’t.”
“I can.”
“you can’t.”
“Can!”
“Can’t!”
The Chinese version (translated in English) would be something like this:
I can lick you.
How dare you!
What if I dare?
Then you try.
Try? Hm, you wait and see.
Wait and see? I’m not afraid.
Not afraid? OK. You don’t run away.
Who runs away? Come on and lick
Well, I agree that there may be some differences between the Chinese way of thinking and that of other countries, but the example looks like a stereotype, because the Chinese may easily switch to the American version. Back to the question: I think language and culture may indeed influence a programmer's learning process in one way or another, but this influence is definitely not decisive. Maybe the culture you're exposed to makes it a little uncomfortable to get used to some notions in some programming language, recursion or whatever, but time will solve that.
I was recently reading about a study that looked at how the grammar of someone's native language affected their speed of learning math. ... Important: the only answers I'm interested in are either references to scientific studies, or personal observations from someone intimately familiar with a particular culture/language, including from their own experience.
I learned a lot of maths before I started programming (enough to count as "intimately familiar"), and IMO programming is relatively easy: more tangible.
Sometimes I've wondered whether it's beneficial to know more than one human language: if you only know one language, then you might think of the words "cat" and "dog" as being values, i.e. synonymous with cat and dog objects; but if you're fluent in more than one language, then "cat" and "dog" become pointers: because for example the French words "chat" and "chien" are referring/pointing to the same objects as "cat" and "dog", and so clearly there's a distinction between the word and the object.
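In Perl terms, the analogy might be sketched with references: two names that point at one underlying thing, as opposed to a copy of it. The hash %animal and the variable names are purely illustrative.

```perl
use strict;
use warnings;

# One underlying "object": a hash describing an animal.
my %animal = (legs => 4, sound => 'meow');

# Two "words" for it: both are references to the same hash.
my $cat  = \%animal;   # English name
my $chat = \%animal;   # French name

# Changing the object through one name is visible through the other...
$chat->{sound} = 'miaou';
print $cat->{sound}, "\n";

# ...whereas a copy behaves like a separate value.
my %copy = %animal;
$copy{sound} = 'purr';
print $cat->{sound}, "\n";

# Both names compare equal as references to the same thing.
print "same object\n" if $cat == $chat;
```

The two references are distinct words, but there is only one animal behind them, which is roughly the distinction between the word and the object.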
It's disappointing that you post the question without linking to the article which inspired it. I thought of "reverse polish notation" and wondered whether that was at all the kind of differences in "grammar" that were considered in the original study.
The reference you cite seems to rest on the assumption that making it easier helps with learning. In my understanding, there is a countereffect: without enough challenge, you're not learning enough.
There are theories/studies (anyone with a link?) that development of language created crucial pressure on expanding the cerebral cortex and thus "made us human". (in very darwinistic terms: more grey matter ==> better language capabilities ==> better teamwork ==> better survival as a group). So language complexity can't be all bad for learning.
(My only qualification is being an eager follower of The Frontal Cortex blog, so take this with a grain of salt.)
In German we have a strange ordering of numbers: the 10^0 and 10^1 positions are switched, but the others are normal (e.g. 25 is 'five and twenty', 125 is 'one hundred five and twenty'). It's been claimed that this makes learning numbers harder, and thus that German should adopt a more intuitive ordering.
I guess that it helps a lot with doing additions in your head - at least if you stay below 100 or 200 - You can first add the 10^0 position and already say it / write it down while taking any carry into account for the 10^1 position.
(That doesn't continue for 10^2, I guess that would be done in writing by the majority anyway)
Also: abstractions. There are languages where numbers aren't abstracted from objects, "two coconuts" and "two sabretooth tigers" don't share a common "two" word / concept. Such a language would probably be very bad for developing math skills. Here the abstraction (separating number and object) in language is important.
Generally, I'd say the language has a strong effect on shaping a developing mind, and I see no reason why this should not extend to culture.
Of course it's still open what would be the "right kind of complexity" - for what, and how particular language features affect general improvement vs. establishment of an elite (i.e. "sharpening the skills of the gifted, while hampering the rest").
Interesting Question, no doubt - looking forward to other replies.