Pass command line parameters to perl via file? - perl

Can command-line parameters be saved to a file, and the file then passed to perl so that it parses out the options? Something like the response files (name prefixed with @) that some Microsoft tools accept.

I am trying to pass an expression to perl on the command line, like perl -e 'print "\n"', and the Windows command prompt makes using double quotes a little hard.
There are several solutions, from most to least preferable.
Write your program to a file
If your one liner is too big or complicated, write it to a file and run it. This avoids messing with shell escapes. You can reuse it and debug it and work in a real editor.
perl path\to\some_program
Command-line options to perl can go on the #! line, which is otherwise useless on Windows. Here's an example.
#!/usr/bin/perl -i.bak -p
# -i.bak Backs up the file.
# -p Puts each line into $_ and writes out the new value of $_.
# So this replaces every " in a file with '.
s{"}{'}g;
Use alternative quote delimiters
Perl has a slew of alternative ways to write quotes. Use them instead. This is good both for one-liners and for things like q[<tag key='value'>].
perl -e "print qq[\n]"
Escape the quote
^ is the cmd.exe escape character. So ^" is treated as a literal quote.
perl -e "print ^"\n^""
Pretty yucky. I'd prefer using qq[] and reserving ^" for when you need to print a literal quote.
perl -e "print qq[^"\n]"
Use the ASCII code
The ASCII and UTF-8 hex code for " is 22. You can supply this to Perl with qq[\x22].
perl -e "print qq[\x22\n]"
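The quoting operators above can also be tried in a script file, away from any shell. A minimal sketch using only core Perl; the variable names are made up for illustration:

```perl
use strict;
use warnings;

# q[] behaves like single quotes (no interpolation), qq[] like double quotes.
my $tag     = q[<tag key='value'>];
my $newline = qq[\n];
# \x22 is the double-quote character, so no literal " is needed at all.
my $quoted  = qq[\x22hello\x22];

print $tag, $newline;      # <tag key='value'>
print $quoted, $newline;   # "hello"
```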

You can read the file into a string and then use
use Getopt::Long qw(GetOptionsFromString);
$ret = GetOptionsFromString($string, ...);
to parse the options from that.
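A minimal sketch of that approach, assuming the options live in a plain text file. The option names (--name, --count, --verbose) and the helper names are hypothetical; GetOptionsFromString itself is part of Getopt::Long and splits the string into words shell-style before parsing.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromString);

# Read a whole options file into one string (line breaks become spaces).
sub read_option_file {
    my ($path) = @_;
    open my $fh, '<', $path or die "cannot open $path: $!\n";
    my $string = do { local $/; <$fh> };
    close $fh;
    $string =~ s/[\r\n]+/ /g;
    $string =~ s/^\s+//;
    $string =~ s/\s+$//;
    return $string;
}

# Parse an option string into a hash; the option names are illustrative only.
sub parse_option_string {
    my ($string) = @_;
    my %opt = (count => 1, verbose => 0);
    # In list context the second return value holds leftover non-option words.
    my ($ok, $rest) = GetOptionsFromString(
        $string,
        'name=s'  => \$opt{name},
        'count=i' => \$opt{count},
        'verbose' => \$opt{verbose},
    );
    die "could not parse options\n" unless $ok;
    return (\%opt, $rest);
}

my ($opt, $rest) = parse_option_string('--name "hello world" --count 3 --verbose input.txt');
print "$opt->{name} $opt->{count} $opt->{verbose} @$rest\n";
```

With a real file, the call would be parse_option_string(read_option_file($path)).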

Related

Replacing Windows CRLF with Unix LF using Perl -- `Unrecognized switch: -g`?

Problem Background
We have several thousand large (more than 10M lines each) text files of tabular data produced by a Windows machine, which we need to prepare for upload to a database.
We need to change the file encoding of these files from cp1252 to utf-8, replace any bare Unix LF sequences (i.e. \n) with spaces, then replace the DOS line end sequences ("CR-LF", i.e. \r\n) with Unix line end sequences (i.e. \n).
The dos2unix utility is not available for this task.
We initially had a bash function that packaged these operations together using iconv and sed, with iconv doing the encoding and sed dealing with the LF/CRLF sequences. I'm trying to replace part of this bash function with a perl command.
Example Code
Based on some helpful code review, I want to change this function to a perl script.
The author of the code review suggested the following perl to replace CRLF (i.e. "\r\n") with LF ("\n").
perl -g -pe 's/(?<!\r)\n/ /g; s/\r\n/\n/g;'
The explanation for why this is better than what we had previously makes perfect sense, but this line fails for me with:
Unrecognized switch: -g (-h will show valid options).
More interestingly, the author of the code review also suggests it is possible to perform the decode/recode in a perl script, too, but I am completely unsure where to start.
Questions
Please can someone explain why the suggested answer fails with Unrecognized switch: -g (-h will show valid options)?
If it helps, the line is supposed to receive piped input from iconv as follows (though I am interested in learning how to use perl to do the decoding/recoding step, too):
iconv --from-code=CP1252 --to-code=UTF-8 "$1" | \
perl -g -pe 's/(?<!\r)\n/ /g; s/\r\n/\n/g;' \
> "$2"
(Highly simplified) example input for testing:
apple|orange|\n|lemon\r\nrasperry|strawberry|mango|\n\r\n
Desired output:
apple|orange| |lemon\nrasperry|strawberry|mango| \n
Perl v5.36.0 added the command-line switch -g ('gulp mode'), an alias for -0777 that reads the whole input at once.
This works in Perl v5.36.0 and later:
s=$(printf "Line 1\nStill Line 1\r\nLine 2\r\nLine 3\r\n")
perl -g -pe 's/(?<!\r)\n/ /g; s/\r\n/\n/g;' <<<"$s"
Prints:
Line 1 Still Line 1
Line 2
Line 3
With any version of perl earlier than v5.36.0, use -0777 instead:
perl -0777 -pe 's/(?<!\r)\n/ /g; s/\r\n/\n/g;' <<<"$s"
# same
BTW, the conversion you are looking for is way easier in this case with awk, since it is close to awk's defaults.
Just do this:
awk -v RS="\r\n" '{gsub(/\n/," ")} 1' <<<"$s"
Line 1 Still Line 1
Line 2
Line 3
Or, if you have a file:
awk -v RS="\r\n" '{gsub(/\n/," ")} 1' file
This is superior to the posted perl solution since the file is processed record by record (each block of text separated by \r\n) rather than having to read the entire file into memory.
(On Windows you may need to do awk -v RS="\r\n" -v ORS="\n" '...')
Another note:
You can get similar behavior from Perl by:
Setting the input record separator to the fixed string "\r\n" ($/="\r\n") in a BEGIN block;
Using the -l switch so every record has the input record separator removed;
Using tr for speedy replacement of \n with ' ';
Possibly setting the output record separator, $\="\n", on Windows.
Full command:
perl -lpE 'BEGIN{$/="\r\n"} tr/\n/ /' file
The error message is about the command-line switch -g in perl -g -pe .... It is not about the /g modifier on the regex, which is valid (but useless without slurp mode, since -p reads line by line and each line contains only a single \n anyway).
This switch simply does not exist with the perl version you are using. It was only added with perl 5.36, so you are likely using an older version. Try -0777 instead.
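The question also asks how to do the decode/recode step in Perl itself. One possible sketch, assuming PerlIO encoding layers do the cp1252 to UTF-8 conversion and the input is read one CRLF-terminated record at a time so the whole file never sits in memory. The function name is hypothetical, and the demonstration runs on in-memory handles rather than real files.

```perl
use strict;
use warnings;

# Copy $in to $out record by record: records end in CRLF, bare LFs inside a
# record become spaces, and each record is written with a single LF.
sub crlf_records_to_lf {
    my ($in, $out) = @_;
    local $/ = "\r\n";              # read up to each CRLF
    while (my $record = <$in>) {
        chomp $record;              # strips the trailing CRLF, if present
        $record =~ tr/\n/ /;        # bare LF -> space
        print {$out} $record, "\n";
    }
    return;
}

# Demonstration on in-memory handles. Real use would open paths instead,
# e.g. open my $in, '<:raw:encoding(cp1252)', $path.
my $bytes = "caf\xE9|\n|x\r\nlast|\n\r\n";   # \xE9 is e-acute in cp1252
open my $in,  '<:raw:encoding(cp1252)', \$bytes     or die $!;
open my $out, '>:raw:encoding(UTF-8)',  \my $result or die $!;
crlf_records_to_lf($in, $out);
close $out;
print $result;
```

The :raw layer is there so that Perl on Windows does not translate line endings on its own before the code sees them.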

Remove all lines from a file that contain a non-ascii character

I have a file with thousands of lines in it. I would like to search each line for a non-ascii character, and if found, delete that entire line.
I found this bit of code in perl:
perl -i.bak -ne 'print unless(/[^[:ascii:]]/)' file
But I get this error when I run it with my file:
Can't find string terminator "'" anywhere before EOF at -e line 1.
Does anyone have any code for an actual perl script instead of a one liner like the above?
That's a shell quoting error, most likely because you're on a Windows machine: cmd.exe does not treat single quotes as quote characters.
Use double quotes instead of single quotes:
perl -i.bak -ne "print unless(/[^[:ascii:]]/)" file
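Since the question asks for an actual script rather than a one-liner, here is a minimal sketch of the same filter in script form, which also sidesteps shell quoting entirely. The function name is hypothetical, and the demonstration reads from an in-memory string instead of a file.

```perl
use strict;
use warnings;

# Copy lines from $in to $out, skipping any line that contains a byte
# outside the ASCII range (the same test the one-liner uses).
sub copy_ascii_lines {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        next if $line =~ /[^[:ascii:]]/;
        print {$out} $line;
    }
    return;
}

# Demonstration: the middle line holds the UTF-8 bytes for e-acute.
my $input = "plain line\ncaf\xC3\xA9 line\nanother plain line\n";
open my $in, '<', \$input or die $!;
copy_ascii_lines($in, \*STDOUT);
close $in;
```

For a real file, open the input path for reading and a new output path for writing, then pass both handles to the function.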

Perl one-liner to remove trailing space from a file

I am calling a perl one-liner inside a Perl script.
The intention of the one-liner is to remove the trailing space from a file.
Inside the main perl script:
`perl -pi -e 's/\s+$//' tape.txt`;
Now it throws the error Substitution replacement not terminated at -e line 2.
Where is it going wrong?
It's because of the $/ (the input record separator special variable) inside your main perl script. Note that variables are interpolated inside `` strings just like inside "" strings, and the fact that there are some single quotes in there doesn't change that. You need to escape that $:
`perl -pi -e 's/\s+\$//' tape.txt`;
The backtick syntax interpolates the string the way double quotes do and only then hands the result to a shell, so the shell's single quotes come too late to protect the $.
A cleaner syntax might be:
system('perl -pli -e "s/\s*$//" tape.txt');
Since you aren't capturing the output of the command, using system in lieu of backticks or qx isn't an issue.
Also, adding the -l switch autochomps each line read and then adds a newline back, which is probably what you want.
\s matches [ \t\n\r\f], and you do not want to match \n.
Notice use of {} for subst delimiters:
$ echo -e 'hi \nbye'| perl -pe 's{[\t\040]+$}{};' | cat -A
hi$
bye$
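Another way around the interpolation problem is to keep both the shell and Perl's string interpolation out of the picture. A minimal sketch, assuming the list form of system() and single-quoted arguments; the function name is hypothetical and the demonstration works on a scratch file rather than tape.txt.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Strip trailing spaces and tabs from every line of $path, in place.
# Passing a list to system() bypasses the shell, and single-quoted Perl
# strings do not interpolate, so neither layer touches the $ or backslashes.
sub strip_trailing_blanks {
    my ($path) = @_;
    my $status = system($^X, '-pi', '-e', 's/[ \t]+$//', $path);
    die "perl exited with status $status\n" if $status != 0;
    return;
}

# Demonstration on a scratch file.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "hi \nbye\t\n";
close $fh or die $!;
strip_trailing_blanks($path);
open my $in, '<', $path or die $!;
my $content = do { local $/; <$in> };
close $in;
print $content eq "hi\nbye\n" ? "stripped\n" : "unchanged\n";
```

$^X is the path of the perl that is currently running, so the child process uses the same interpreter.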

Using perl in windows oddities

I've got a Windows bat file (using ActivePerl):
@echo off
perl -p -e 's/DIV\[/div\[/g' orginal.txt > output.txt
perl -p -e 'rename("output.txt", "orginal.txt")';
...
I'm running it as a .bat file, and I just can't get it to run properly.
Can't open ;: No such file or directory, <> line 12248.
Can't find string terminator "'" anywhere before EOF at -e line 1.
Not sure what I'm doing wrong..
You can't use single-quotes to enclose the Perl code in Windows. As a result, you need to escape the double-quotes or find other alternatives, such as qq(...).
perl -pe "s/DIV\[/div\[/g" original.txt > output.txt
perl -e "rename(qq(output.txt), qq/original.txt/)"
Note that in this case, the arguments to rename can simply be rename('a.txt', 'b.txt') since they are literals and no variable interpolation is required.
You ought to use double quotes to quote the program text under Windows cmd. In your example, you can just switch double and single quotes. In cases where you really need double quotes in the perl text, use qq{ .... } instead.
The other posters are correct: windows requires double quotes for -e scripts to perl, which often screws things up. There is one more thing you can do, though: Use the -i switch, like this:
@echo off
perl -pi.bak -we "s/DIV\[/div\[/g" original.txt
The -i.bak switch will edit the file in place - no rename required - AND it will store a backup of the file in "original.txt.bak". If you do not want a backup, remove the ".bak" part and just use -pi.
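If the batch file keeps growing, the same in-place edit can live in a script file, where cmd.exe quoting never comes into play. A minimal sketch, assuming the $^I variable (the in-script counterpart of the -i switch); the function name is hypothetical and the demonstration edits a scratch file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# In-script equivalent of: perl -pi.bak -e "s/DIV\[/div[/g" FILE
# Setting $^I turns on in-place editing for the <> loop; the original
# content is kept in FILE.bak.
sub lowercase_div_in_place {
    my ($path) = @_;
    local ($^I, @ARGV) = ('.bak', $path);
    while (<>) {
        s/DIV\[/div[/g;
        print;                # goes to the rewritten file, not the screen
    }
    return;
}

# Demonstration on a scratch file.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "<DIV[1]>\nDIV[2]\n";
close $fh or die $!;
lowercase_div_in_place($path);
open my $in, '<', $path or die $!;
my $content = do { local $/; <$in> };
close $in;
print $content;
unlink "$path.bak";           # tidy up the backup made for this demo
```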

How do I protect quotes in a batch file?

I want to wrap a Perl one-liner in a batch file. For a (trivial) example, in a Unix shell, I could quote up a command like this:
perl -e 'print localtime() . "\n"'
But DOS chokes on that with this helpful error message:
Can't find string terminator "'" anywhere before EOF at -e line 1.
What's the best way to do this within a .bat file?
For Perl stuff on Windows, I try to use the generalized quoting as much as possible so I don't get leaning toothpick syndrome. I save the quotes for the stuff that DOS needs:
perl -e "print scalar localtime() . qq(\n)"
If you just need a newline at the end of the print, you can let the -l switch do that for you:
perl -le "print scalar localtime()"
For other cool things you can do with switches, see the perlrun documentation.
In Windows' "DOS prompt" (cmd.exe) you need to use double quotes not single quotes. For inside the quoted Perl code, Perl gives you a lot of options. Three are:
perl -e "print localtime() . qq(\n)"
perl -e "print localtime() . $/"
perl -le "print ''.localtime()"
If you have Perl 5.10 or newer:
perl -E "say scalar localtime()"
Thanks to J.F. Sebastian's comment.
For general batch files under Windows NT+, the ^ character escapes lots of things (<>|&), but for quotes, doubling them works wonders:
C:\>perl -e "print localtime() . ""\n"""
Thu Oct 2 09:17:32 2008
First, any answer you get to this is command-specific, because the DOS shell doesn't parse the command line like a Unix one does; it passes the entire unparsed string to the command, which does any splitting. That said, if using /subsystem:console, the C runtime provides splitting before calling main(), and most commands use this.
If an application is using this splitting, the way you type a literal double-quote is by doubling it. So you'd do
perl -e "print localtime() . ""\n"""
In DOS, you use the "" around your Perl command. The DOS shell doesn't do single quotes like the normal Unix shell:
perl -e "print localtime();"