Checking if a file is a text file without using -T? - perl

The title is pretty self-explanatory: are there file-testing functions in Perl, or is there a built-in module that allows file-testing operations?

This is a non-issue, as -T, like all of the file test operators, is a Perl builtin.
They are documented here: perldoc -X
-X FILEHANDLE
-X EXPR
-X DIRHANDLE
-X
A file test, where X is one of the letters listed below. This unary operator takes one argument, either a filename, a filehandle, or a dirhandle, and tests the associated file to see if something is true about it. If the argument is omitted, tests $_ , except for -t , which tests STDIN. Unless otherwise documented, it returns 1 for true and '' for false, or the undefined value if the file doesn't exist. Despite the funny names, precedence is the same as any other named unary operator.
...
-T File is an ASCII text file (heuristic guess).
-B File is a "binary" file (opposite of -T).

The "file test" functions available in Perl are part of the programming language itself. Based on what you're saying and from the comments on this page, it may be that you have been "asked not to use external commands" because someone thinks that the -T flag is relying on something that belongs to the underlying environment and not the Perl language.
-T is part of the -X file test unary operators which are inherent to Perl:
http://perldoc.perl.org/functions/-X.html
Underlying the -T operator (specifically) is the function pp_fttext, which lives in pp_sys.c. These are part of the underlying code that comprises Perl, and you can verify this by looking in the root directory of the Perl source distribution:
http://www.perl.org/get.html
It may be that the only way to do what you were originally asking (how to do this without -T) is to do what you were asked not to do (use something external to Perl to perform the test).


How can I enable tab-completion for `@` path options to HTTPie in fish?

HTTPie accepts paths as arguments with options that include the @ sign. Unfortunately, they don't seem to work with shell completions in fish. Instead, the option is treated as an opaque string.
To stick with the file upload example from the HTTPie documentation with a file at ~/files/data.xml, I would expect to be able to tab complete the file name when typing:
http -f POST pie.dev/post name='John Smith' cv@~/files/da<TAB>
However, no completion is offered.
I have installed the completions for fish from the HTTPie project and they work for short and long arguments. This file does not specify how to complete the @ arguments though.
In addition, I looked into specifying my own completions, but I could not find a way to get file completions working with an arbitrary prefix.
How could I implement a completion for these path arguments for HTTPie?
Currently, the fish completions for HTTPie do not have completion for file path arguments with @. There is a more general GitHub Issue open about this.
If this is something you'd like to work on, either for yourself or for the project, you might be able to draw some inspiration for the fish implementation from an HTTPie plugin for zsh+ohmyzsh that achieves your desired behaviour.
I managed to get the tab completion of the path arguments working with some caveats.
This adds the completion:
complete -c http --condition "__is_httpie_path_argument" -a "(__complete_httpie_path_argument (commandline -t))"
With the following functions:
function __is_httpie_path_argument
    set -l arg (commandline -t)
    __match_httpie_path_argument --quiet -- $arg
end

function __match_httpie_path_argument
    string match --entire --regex '^([^@:=]*)(@|=@|:=@)(.*)$' $argv
end

function __complete_httpie_path_argument
    __complete_httpie_path_argument_helper (__match_httpie_path_argument -- $argv[1])
end

function __complete_httpie_path_argument_helper
    set -l arg $argv[1]
    set -l field $argv[2]
    set -l operator $argv[3]
    set -l path $argv[4]
    string collect $field$operator(__fish_complete_path $path)
end
The caveat is that this does not expand any variables nor the tilde ~. It essentially only works for plain paths — relative or absolute.
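To see how the regular expression carves up the current token, here is a rough sketch in plain shell with sed rather than fish. It assumes the operators are HTTPie's @, =@ and :=@ file operators; the helper name split_httpie_token and the sample tokens are invented for illustration. The three capture groups correspond to the field, operator and path variables that the fish helper unpacks from its arguments.

```shell
# Hypothetical helper: apply the same pattern as __match_httpie_path_argument
# and print the three capture groups. Prints nothing if the token is not a
# file-path argument.
split_httpie_token() {
    printf '%s\n' "$1" |
        sed -nE 's/^([^@:=]*)(@|=@|:=@)(.*)$/field=\1 operator=\2 path=\3/p'
}

split_httpie_token 'cv@~/files/da'    # field=cv operator=@ path=~/files/da
split_httpie_token 'meta:=@data/pa'   # field=meta operator=:=@ path=data/pa
split_httpie_token 'name=John'        # no output: not a path argument
```

Only the path part is handed to the file completer; the field and operator are glued back on in front of each candidate so that the completed token is still a valid HTTPie argument.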

Perl -M will not find a hardcoded path to a module

Background
Inside a csh script, I am invoking a subroutine from a Perl module and saving its result to a variable in the following manner:
set result =`perl -M/some/hard/coded/path/lib.pm=theFunction -e 'theFunction( $A_VARIABLE_ARGUMENT )'`
Despite the fact that I explicitly specify the module, my script throws this error:
Module name required with -M option
Question
How do I invoke a hardcoded module with perl's -M option?
You cannot, as the -M option is translated into a use statement, which takes only module names, not paths. However, you can make the directory the first module search path with the -I option. Module names are resolved relative to each search path by translating them like Foo::Bar -> Foo/Bar.pm.
perl -I/some/hard/coded/path -Mlib=theFunction -e 'theFunction( $A_VARIABLE_ARGUMENT )'
As a note, you should definitely not call your module or package lib, because lib is an important core module: it is the pragma that manipulates the module search path from inside a program, much as -I does on the command line.

How can I ensure my autocompleted spaces are fed into my function properly?

I'm using zsh, and am trying to write a function to operate on a URL and a pathname:
function my-function
{
somecommand --url $1 $(readlink -f $2)
}
(to complicate things somewhat, the function actually uses sh syntax, as it is sourced from my ~/.zshrc using a trick like this). The readlink is there to expand symlinks and ensure directories such as . are evaluated correctly (the directory name is stored for later use by somecommand).
When I type a command from the command-line like this:
my-function http://example.org/example /tmp/myexampledirectory
... it works fine, even if I autocomplete the directory name. However, if the directory name contains spaces, zsh completes it like this:
my-function http://example.org/example /tmp/My\ Example\ Directory
For most "normal" commands (cp, mv, etc.) that never seems to cause a problem. However, in my case, somecommand sees $2 as only being /tmp/My - presumably the rest is seen as another argument.
How can I avoid this situation? I would prefer not to alter the standard zsh autocompletion, but rather find a way for my function to handle this.
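The zsh completion system works very well here, and the solution is very simple: put double quotes around the readlink argument and around the command substitution itself in the script:
somecommand --url "$1" "$(readlink -f "$2")"
The point is that the shell removes the backslashes that escape the spaces before readlink ever sees its arguments, and the output of an unquoted command substitution is split on whitespace again. Compare three results: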
1. Without backslashes and quotes readlink -f assumes that there are three different files/directories (with default path in current directory) and produces
$ readlink -f /tmp/My Example Directory
/tmp/My
/home/jimmij/Example
/home/jimmij/Directory
2. With escaping backslashes but without quotes readlink -f understands that there is only one directory, but removes backslashes from output, so that somecommand takes three separate arguments
$ readlink -f /tmp/My\ Example\ Directory
/tmp/My Example Directory
3. With backslashes inside double quotes, the backslashes are kept literally, so readlink -f receives a single argument and prints it back with the backslashes intact
$ readlink -f "/tmp/My\ Example\ Directory"
/tmp/My\ Example\ Directory
BTW, as a rule of thumb: if there are any problems with whitespaces in the shell-like scripts (bash, zsh, whatever) the first thing to play with is different quotation marks around variables.
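The effect of quoting can be checked directly with a small experiment. In this sketch, count_args is a made-up stand-in for somecommand that only reports how many arguments it received, and the directory is created in a temporary location. It is written for sh-style shells, which is what the question says the function uses.

```shell
# Hypothetical stand-in for somecommand: print the number of arguments received.
count_args() { echo "$#"; }

tmpdir=$(mktemp -d)
mkdir "$tmpdir/My Example Directory"
dir="$tmpdir/My Example Directory"

# Unquoted command substitution: the output is split on whitespace.
count_args $(readlink -f "$dir")     # 3, as long as $tmpdir itself has no whitespace

# Quoted command substitution: the output stays a single argument.
count_args "$(readlink -f "$dir")"   # 1

rm -rf "$tmpdir"
```

The completion-inserted backslashes never reach the function at all: the shell consumes them while parsing the command line, so inside the function the value already contains plain spaces.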

Curl command uploading document fails when run from Perl

I've got a Perl script that uploads documents into Alfresco using curl.
Some of the documents have an ampersand in the file name, and initially this caused curl to fail. I fixed this by placing a caret symbol in front of the ampersand. But now I'm finding that some documents fail to upload when they don't have a space on either side of the ampersand. Other documents with spaces in the file name and an ampersand do load successfully.
The snippet of Perl code that is running is:
# Escape & for curl in file name with a ^
my $downloadFileNameEsc = ${downloadfile};
$downloadFileNameEsc =~ s/&/^&/g;
$command = "curl -u admin:admin -F file=\@${downloadFileNameEsc} -F id=\"${docId}\" -F title=\"${docTitle}\" -F tags=\"$catTagStr\" -F abstract=\"${abstract}\" -F published=\"${publishedDate}\" -F pubId=\"${pubId}\" -F pubName=\"${pubName}\" -F modified=\"${modifiedDate}\" -F archived=\"${archived}\" -F expiry=\"${expiryDate}\" -F groupIds=\"${groupIdStr}\" -F groupNames=\"${groupNameStr}\" ${docLoadUrl}";
logmsg(4, $command);
my @cmdOutput = `$command`;
$exitStatus = $?;
my $upload = 0;
logmsg(4, "Alfresco upload status $exitStatus");
if ($exitStatus != 0) {
You can see that I am using backticks to execute the curl command so that I can read the response. The perl script is being run under windows.
What this effectively tries to run is:
curl -u admin:admin -F file=@tmp-download/Multiple%20Trusts%20Gift%20^&%20Loan.pdf -F id="e2ef104d-b4be-4896-8360-7d6f2e7c7b72" ....
This works.
curl -u admin:admin -F file=@tmp-download/Quarterly_Buys^&sells_Q1_2006.doc -F id="78d18634-ee93-4c29-b01d-270aeee3219a" ....
This fails!!
The only difference, as far as I can see, is that the one that works has spaces (%20) in the file name somewhere around the ampersand, not necessarily next to it.
I can't see why one runs successfully and the other doesn't. Think it must be to do with backticks and ampersands in the file name. I haven't tried using system as I wanted to capture the response.
Any thoughts because I've exhausted all options.
You should learn to use Perl modules. Perl has some great modules to handle the Web requests. If you depend upon operating system commands, you will end up with not only dependencies upon those commands, but shell interactions and whether or not you need to quote special characters.
Perl modules remove a lot of the issues that you can run into. You are no longer dependent upon particular commands or even particular implementation of those commands. (The curl command can vary from system to system, and may not even be on the system you're on). Plus, most of these modules handle the piddling details for you (such as URI escaping strings).
LWP is the standard Perl library for implementing these requests. Take a look at the LWP Cookbook. This is a tutorial on the whole HTTP process. Basically, you need to create an agent, which is really just a virtual web browser for you to use. Then you can configure it as needed (for example, setting the machine, browser type, etc.).
What is really nice is HTTP::Request::Common that provides a simple interface for using HTTP forms.
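use HTTP::Request::Common qw(POST);

# Build a multipart/form-data request. An array reference as a field value
# means "upload the contents of this file".
my $request = POST $docLoadUrl,
    Content_Type => 'form-data',
    Content      => [
        file      => [ $downloadFileName ],
        id        => $docId,
        title     => $docTitle,
        tags      => $catTagStr,
        abstract  => $abstract,
        published => $publishedDate,
        pubId     => $pubId,
        pubName   => $pubName,
        # ...
    ];
# POST only builds the request; send it with an LWP::UserAgent object,
# e.g. $ua->request($request).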
This is a lot easier to read and maintain. Plus, it will handle URI encoding for you.

Rename multiple files from command line [duplicate]

Possible Duplicate:
Renaming lots of files in Linux according to a pattern
I have multiple files in this format:
file_1.pdf
file_2.pdf
...
file_100.pdf
My question is: how can I rename all of the files so that they look like this?
file_001.pdf
file_002.pdf
...
file_100.pdf
I know you can rename multiple files with 'rename', but I don't know how to do this in this case.
You can do this using the Perl tool rename from the shell prompt. (There are other tools with the same name which may or may not be able to do this, so be careful.)
rename 's/(\d+)/sprintf("%03d", $1)/e' *.pdf
If you want to do a dry run to make sure you don't clobber any files, add the -n switch to the command.
note
If you run the following command (linux)
$ file $(readlink -f $(type -p rename))
and you have a result like
.../rename: Perl script, ASCII text executable
then this seems to be the right tool =)
This seems to be the default rename command on Ubuntu.
To make it the default on Debian and derivatives such as Ubuntu:
sudo update-alternatives --set rename /path/to/rename
Explanations
s/// is the basic substitution expression: s/to_replace/replacement/. Check perldoc perlre.
(\d+) captures, with (), one or more (+) digits (\d) into $1.
sprintf("%03d", $1): sprintf is like printf, but instead of printing it returns a string formatted with the same syntax. %03d pads with zeros to a width of three, and $1 is the captured string. Check perldoc -f sprintf.
Calling a Perl function in the replacement is permitted because of the e modifier at the end of the expression, which makes Perl evaluate the replacement as code.
If you want to do it with pure bash:
for f in file_*.pdf; do x="${f##*_}"; echo mv "$f" "${f%_*}$(printf '_%03d.pdf' "${x%.pdf}")"; done
(note the debugging echo)
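The parameter expansions in that one-liner can be unpacked into a small function. The sketch below uses a made-up helper name, padded_name, and sample files in a temporary directory so that it can be run safely; it performs the mv for real, but only inside that directory. It assumes the numbers have no leading zeros, since printf would read a value such as 08 as invalid octal.

```shell
# Hypothetical helper: compute the zero-padded name for one file_N.pdf path.
padded_name() {
    num=${1##*_}       # drop everything up to the last "_":  file_7.pdf -> 7.pdf
    num=${num%.pdf}    # drop the ".pdf" suffix:              7.pdf -> 7
    printf '%s_%03d.pdf\n' "${1%_*}" "$num"   # prefix before the last "_", then padded number
}

tmpdir=$(mktemp -d)
touch "$tmpdir/file_1.pdf" "$tmpdir/file_25.pdf" "$tmpdir/file_100.pdf"

for f in "$tmpdir"/file_*.pdf; do
    new=$(padded_name "$f")
    # Skip names that are already padded (for example file_100.pdf).
    [ "$f" = "$new" ] || mv -- "$f" "$new"
done

ls "$tmpdir"    # file_001.pdf  file_025.pdf  file_100.pdf
rm -rf "$tmpdir"
```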