Sublime syntax highlighting for Perl qw, qq, q not working fully

I have been using Sublime for a week and I am very pleased with it, but I have one small problem. I write Perl in Sublime.
Here is the problem:
Sublime does not recognize that 'some string is quoted, and it highlights $test_scalar and everything after it as if it were a string. When I type it like this:
There is no problem.
I tried editing the Perl.tmLanguage file, but I did not understand it.
Can someone help me, please?

Perl is one of the few programming languages that use this type of construct for quoting strings, and many program editors simply don't get it.
Imagine you're writing a syntax highlighter, and you have to understand all of these are the same:
my $string = "this is my string";
my $string = qq(This is my string);
my $string = qq/This is my string/;
my $string = qq#This is my string#;
my $string = qq
(This is my string);
Your syntax highlighter would have to understand that q, qq, and qx are quoting options, and that the character following them (after possible white space) is the character that's doing the quoting. Oh, and also that if the character is a (, a {, or a [, the closing quote is a ), }, or a ]. And, that this can be on more than one line. And, you really only need this for Perl.
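As a rough illustration of the delimiter rule described above, here is a minimal Perl sketch. The helper name closing_delimiter and the %PAIR table are hypothetical, made up for this example; the table also lists < and >, which Perl treats as a bracketing pair as well. This is not how Sublime, Vim, or any real highlighter is implemented.

```perl
use strict;
use warnings;

# Hypothetical lookup table: bracketing delimiters and their closers.
my %PAIR = ( '(' => ')', '{' => '}', '[' => ']', '<' => '>' );

# Hypothetical helper: given the character that follows q, qq, or qx,
# return the character that ends the quoted string.
sub closing_delimiter {
    my ($open) = @_;
    return exists $PAIR{$open} ? $PAIR{$open} : $open;
}

# The same string written with four different delimiters.
my @forms = (
    "this is my string",
    qq(this is my string),
    qq/this is my string/,
    qq#this is my string#,
);

print "closer for '$_' is '", closing_delimiter($_), "'\n" for qw{ ( / [ };
```

A highlighter would still have to skip optional white space (including newlines) between the operator and the opening delimiter, and track nesting for the bracketing pairs, which this sketch does not attempt.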
I know that Vim can handle the qq quoting issue, but many other program editors I have tried failed. Even Stack Overflow's syntax highlighter (Google's prettify) fails.
Try Notepad++ or Textpad if you're on Windows. Or, try Eclipse with the EPIC editor. I believe that one also works.

Because Perl5 can't be statically parsed, editors have to make guesses about syntax. Could they do a better job in this case? Probably, but do keep in mind that it's impossible to do this perfectly.
In any case, your best bet is to get in touch with the author of the Perl syntax highlighting plugin for your editor.

As you said, there is no problem with the code and no syntax error. This is normal behavior for both Sublime and Vim: when you continue the qq operator onto the next line, string highlighting does not work in either editor.

cperl-mode.el for Emacs does the job:
Maybe you can take a look at its source and try to use the same rules in Sublime, or at least point this out to the plugin author.


String or string literal fails

I am trying something simple to follow exercises in a book. For example, typing “hello” at the prompt in the interactions window.
I get the following error:
“a”: unbound identifier in module in: “a”
I believe simple things like this worked before, so I want to know what to check to resolve this problem.
Your problem is the quotation marks, a very common problem. Look:
“a”
The quotation marks are curly (slanted).
They should be straight, like this: "a".
Copy and paste this into your REPL and press Return (this time it will work!):
"hello"
This is written with the right quotation marks "" and not “”.
If you copy and paste from PDF books, these wrong quotation marks sometimes appear as a result (Realm of Racket is one example; I recently had that problem when copying from it). Quotation marks from MS Word when using the Times Roman font are also of this strange type, and in some programming blogs, too, the quotation marks get spoiled when you copy them out.
How to avoid it? Type the examples manually into the DrRacket editor. Problem solved! Plus, you learn things much better anyway if you type them yourself (the "hard way" approach ;) ).
And you learn that even copying and pasting is a skill one sometimes has to learn anew. Welcome to programming (the long road of learning) :D
Remember to enter the quotes " around the hello too.
"hello" is a string which contains the text hello
hello is the name of a variable (an identifier),
so if you haven't defined the name hello you get an
error saying that the identifier is undefined

Perl's Unicode::UCD::charscript function isn't identifying my character

The character here is 户, which is U+6237 CJK UNIFIED IDEOGRAPH-6237. But the charscript function provided by the standard Unicode::UCD module is returning an undefined value:
perl -MUnicode::UCD=charscript -wle 'print charscript(chr(0x6237)) // "undef"'
This prints undef. I am using Perl 5.14.2 or 5.18.1; the problem occurs with both versions.
I understand that the character could just as easily be part of a Japanese or even a Korean text, but charscript doesn't even say something like "CJK ideograph"; instead it just returns undef, which is not useful.
What I really want to do is to write a program that I can use to filter my incoming email; messages with subjects in Chinese should be flagged. (I can't read Chinese, and legitimate correspondents know this, and so don't send me mail written in Chinese.) And I have a perfectly good subject line written in Chinese, so I thought to use charscript to help recognize that, but it seems that it doesn't.
Why doesn't charscript return something more useful than undef here?
Is charscript the right thing to use for this?
If not, what is?
[ Added a little later: I checked the relevant Unicode data file, Scripts.txt, and it identifies the script of this character as Han, which, had it been returned by charscript, I would have considered an acceptable result. So the problem really seems to be with the software, and not with my understanding of Unicode. ]
Look at the examples. The usage is
charscript(0x6237)
or
charscript('Han')
You're doing
charscript('户')
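A minimal sketch of the fix, assuming you start from the character itself rather than its code point: ord converts the character to the number that charscript expects. The helper name script_of_char is made up for this example.

```perl
use strict;
use warnings;
use Unicode::UCD qw(charscript);

# Hypothetical helper: charscript() wants a code point (a number),
# so convert the character with ord() first.
sub script_of_char {
    my ($char) = @_;
    return charscript(ord $char);
}

my $script = script_of_char("\x{6237}");   # U+6237, the character from the question
print defined $script ? $script : 'undef', "\n";
```

With the code point passed in, the result should match the Han entry that the question found in Scripts.txt.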

Odd characters in Sublime Text 2: `SOH` and `ACK`

In converting old notes from org syntax to mmd, I used the Clean Text app to remove extra line breaks and non-unicode characters, and to convert to safe line-endings. When pasting the text back into Sublime Text 2, I noticed several odd characters. I don't really care too much about why they're there, I'd just like to know what the characters are, and if they're searchable using a regex?
They are control characters; they don't have a printable representation. I don't know how they ended up in your file.
In a regex, you can search for SOH with \u0001 and ACK with \u0006
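If you would rather find or strip these characters outside the editor, the same search can be sketched in Perl, where SOH and ACK can be written as \x01 and \x06. The function names below are hypothetical, chosen only for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: count SOH (0x01) and ACK (0x06) characters.
sub count_soh_ack {
    my ($text) = @_;
    my $count = () = $text =~ /[\x01\x06]/g;
    return $count;
}

# Hypothetical helper: return a copy of the text with SOH and ACK removed.
sub strip_soh_ack {
    my ($text) = @_;
    $text =~ tr/\x01\x06//d;
    return $text;
}

my $sample = "notes\x01 with\x06 debris";
printf "%d control characters found\n", count_soh_ack($sample);
print strip_soh_ack($sample), "\n";
```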
I said I didn't care about the reason at the time. I did eventually find out what the problem was the next time this happened. Turned out it had to do with running a LaTeX auto-build tool in the background. It was already being compiled in one shell and hung due to an error. The next time I tried to compile, these control characters started showing up in my editor.
I no longer make assumptions about the number of processes I have running. ps aux | grep <whatever> is your friend.

Comment token inside string

In Pig and similar languages, /* begins a block comment. If I put this in a regex string such as 'blah/blah/*', Emacs thinks it is the start of a block comment and syntax highlighting goes to hell. I am not familiar with elisp, but I am fairly sure the problem is in the script that provides the highlighting for Pig.
How can I fix it?
phils pointed out a better designed major mode in the question comments, but since you are still curious: The pig mode version you are using doesn't have the syntax table set up right. The most reliable way for emacs to recognize comments and strings is to use the syntax table to map characters to start/end of comments and strings. The version you are using is trying to do it with font-lock.
You have to escape the \'es and the *. All the characters that are used by the regexp engine, have to be escaped.
If you want to match "\", you might have to write "\\" when using replace-regexp interactively and "\\\\" if you use it as a lisp function.
(I even have to escape my escapes in this comment, so there are 8 escapes in the last escape sequence above)

Are quotes around hash keys a good practice in Perl?

Is it a good idea to quote keys when using a hash in Perl?
I am working on an extremely large legacy Perl code base and trying to adopt a lot of the best practices suggested by Damian Conway in Perl Best Practices. I know that best practices are always a touchy subject with programmers, but hopefully I can get some good answers on this one without starting a flame war. I also know that this is probably something that a lot of people wouldn't argue over due to it being a minor issue, but I'm trying to get a solid list of guidelines to follow as I work my way through this code base.
In the Perl Best Practices book by Damian Conway, there is this example which shows how alignment helps legibility of a section of code, but it doesn't mention (anywhere in the book that I can find) anything about quoting the hash keys.
$ident{ name } = standardize_name($name);
$ident{ age } = time - $birth_date;
$ident{ status } = 'active';
Wouldn't this be better written with quotes to emphasize that you are not using bare words?
$ident{ 'name' } = standardize_name($name);
$ident{ 'age' } = time - $birth_date;
$ident{ 'status' } = 'active';
Without quotes is better. It's in {} so it's obvious that you are not using barewords, plus it is both easier to read and type (two fewer symbols). But all of this depends on the programmer, of course.
When specifying constant string hash keys, you should always use (single) quotes. E.g., $hash{'key'} This is the best choice because it obviates the need to think about this issue and results in consistent formatting. If you leave off the quotes sometimes, you have to remember to add them when your key contains internal hyphens, spaces, or other special characters. You must use quotes in those cases, leading to inconsistent formatting (sometimes unquoted, sometimes quoted). Quoted keys are also more likely to be syntax-highlighted by your editor.
Here's an example where using the "quoted sometimes, not quoted other times" convention can get you into trouble:
$settings{unlink-devices} = 1; # I saved two characters!
That'll compile just fine under use strict, but won't quite do what you expect at runtime. Hash keys are strings. Strings should be quoted as appropriate for their content: single quotes for literal strings, double quotes to allow variable interpolation. Quote your hash keys. It's the safest convention and the simplest to understand and follow.
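To see why hyphenated keys need quotes, here is a small sketch using a made-up key, max-size. Unlike the unlink-devices line above, this one fails to compile under use strict, because Perl reads the unquoted key as the expression max minus size. The string eval is only there so the script can catch the compile error and keep running; all names are hypothetical.

```perl
use strict;
use warnings;

my %settings;

# Unquoted: parsed as the subtraction "max - size", two barewords,
# which strict rejects at compile time. The eval traps that error.
my $ok    = eval q{ $settings{max-size} = 1; 1 };
my $error = $ok ? '' : $@;

# Quoted: the key is the literal string 'max-size'.
$settings{'max-size'} = 1;

print "unquoted key failed: $error" if $error;
print "keys: @{[ sort keys %settings ]}\n";
```

A key that happens to start with the name of a built-in, like the unlink example, is worse, because nothing stops it from compiling.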
I never single-quote hash keys. I know that {} basically works like quotes do, except in special cases (a +, and double-quotes). My editor knows this too, and gives me some color-based cues to make sure that I did what I intended.
Using single-quotes everywhere seems to me like a "defensive" practice perpetrated by people that don't know Perl. Save some keyboard wear and learn Perl :)
With the rant out of the way, the real reason I am posting this comment...the other comments seem to have missed the fact that + will "unquote" a bareword. That means you can write:
sub foo {
$hash{+shift} = 42;
}
or:
use constant foo => 'OH HAI';
$hash{+foo} = 'I AM A LOLCAT';
So it's pretty clear that +shift means "call the shift function" and shift means "the string 'shift'".
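A short sketch of the difference, using a hypothetical %lookup hash and made-up sub names:

```perl
use strict;
use warnings;

my %lookup = (
    shift => 'stored under the literal key',
    alpha => 'stored under the first argument',
);

sub bare_key  { return $lookup{shift}   }  # key is the string 'shift'
sub plus_key  { return $lookup{+shift}  }  # calls shift on @_
sub paren_key { return $lookup{shift()} }  # calls shift on @_

print bare_key('alpha'),  "\n";
print plus_key('alpha'),  "\n";
print paren_key('alpha'), "\n";
```

Only the first call ignores its argument; the other two use 'alpha' as the key.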
I will also point out that cperl-mode highlights all of the various cases correctly. If it doesn't, ping me on IRC and I will fix it :)
(Oh, and one more thing. I do quote attribute names in Moose, as in has 'foo' => .... This is a habit I picked up from working with stevan, and although I think it looks nice... it is a bit inconsistent with the rest of my code. Maybe I will stop doing it soon.)
Quoteless hash keys received syntax-level attention from Larry Wall to make sure that there would be no reason for them to be other than best practice. Don't sweat the quotes.
(Incidentally, quotes on array keys are best practice in PHP, and there can be serious consequences to failing to use them, not to mention tons of E_WARNINGs. Okay in Perl != okay in PHP.)
I don't think there's a best practice on this one. Personally I use them in hash keys like so:
$ident{'name'} = standardize_name($name);
but don't use them to the left of the arrow operator:
$ident = {name => standardize_name($name)};
Don't ask me why, it's just the way I do it :)
I think the most important thing you can do is to always, always, always:
use strict;
use warnings;
That way the compiler will catch any semantic errors for you, leaving you less likely to mistype something, whichever way you decide to go.
And the second most important thing is to be consistent.
I go without quotes, just because it's less to type and read and worry about. The times when I have a key which won't be auto-quoted are so few and far between that they are not worth all the extra work and clutter. Perhaps my choice of hash keys has changed to fit my style, which is just as well. Avoid the edge cases entirely.
It is sort of the same reason I use " by default. It's more common for me to plop a variable in the middle of a string than to use a character that I don't want interpolated. Which is to say, I've more often written 'Hello, my name is $name' than "You owe me $1000".
At the very least, quoting keeps not-so-perfect editors from highlighting hash keys as reserved words. Check out:
$i{keys} = $a;
$i{values} = [1,2];
...
I prefer to go without quotes, unless I want some string interpolation. And then I use double quotes. I liken it to literal numbers. Perl would really allow you to do the following:
$achoo['1'] = 'kleenex';
$achoo['14'] = 'hankies';
But nobody does that. And it doesn't help with clarity; it simply adds two more characters to type. Just like sometimes we specifically want slot #3 in an array, sometimes we want the PATH entry out of %ENV. Single-quoting it adds no clarity as far as I'm concerned.
The way Perl parses code makes it impossible to use other types of "bare words" in a hash index.
Try
$myhash{shift}
and you're only going to get the item stored in the hash under the 'shift' key. You have to do this
$myhash{shift()}
in order to specify that you want the first argument to index your hash.
In addition, I use jEdit, the ONLY visual editor (that I've seen, besides Emacs) that allows you total control over highlighting. So it's doubly clear to me. Anything looking like the former gets KEYWORD3 ($myhash) + SYMBOL ({) + LITERAL2 (shift) + SYMBOL (}); if there is a parenthesis before the closing curly, it gets KEYWORD3 + SYMBOL + KEYWORD1 + SYMBOL (()}). Plus I'll likely format it like this as well:
$myhash{ shift() }
Go with the quotes! They visually break up the syntax and more editors will support them in the syntax highlighting (hey, even Stack Overflow is highlighting the quote version). I'd also argue that you'd notice typos quicker with editors checking that you ended your quote.
It is better with quotes because it allows you to use special characters not permitted in barewords. By using quotes I can use the special characters of my mother tongue in hash keys.
I've wondered about this myself, especially when I found I've made some lapses:
use constant CONSTANT => 'something';
...
my %hash = ();
$hash{CONSTANT} = 'whoops!'; # Not what I intended
$hash{word-with-hyphens} = 'whoops!'; # wrong again
What I tend to do now is to apply quotes universally on a per-hash basis if at least one of the literal keys needs them; and use parentheses with constants:
$hash{CONSTANT()} = 'ugly, but what can you do?';
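The lapse and its two workarounds (the parentheses used here and the unary plus mentioned in another answer) can be checked with a short sketch. The values stored are arbitrary labels chosen for this example.

```perl
use strict;
use warnings;
use constant CONSTANT => 'something';

my %hash;
$hash{CONSTANT}   = 'bareword';   # key is the string 'CONSTANT'
$hash{CONSTANT()} = 'parens';     # key is 'something'
$hash{+CONSTANT}  = 'plus';       # also 'something', overwrites 'parens'

print join(', ', map { "$_ => $hash{$_}" } sort keys %hash), "\n";
```

The hash ends up with two keys, not three, because both workarounds resolve to the constant's value.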
You can precede the key with a "-" (minus character) too, but be aware that the "-" becomes the first character of your key. From some of my code:
$args{-title} ||= "Intrig";
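A quick way to confirm what key that produces (the %args hash here is a stand-in for the one in the snippet above):

```perl
use strict;
use warnings;

my %args;
$args{-title} ||= "Intrig";    # same key as $args{'-title'}

print join(', ', keys %args), "\n";
```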
I use the single quote, double quote, and quoteless way too. All in the same program :-)
I've always used them without quotes but I would echo the use of strict and warnings as they pick out most of the common mistakes.