How do I find the version of a command I am using?

Background
I am attempting to use a command called xmllint to parse an HTML file for a specific value inside a tag. All of the examples I have seen online use the --html option alongside the --xpath option, as in @nwellnhof's example:
xmllint --html --xpath '/html/body/h1[1]' - <<EOF
<BODY>
<H1>Dublin</H1>
EOF
However, my local version of xmllint does not contain the --xpath option. I would like to figure out which version of the command I am using so I can parse html properly.
Question
How do I find out which version of a command I am using on Linux?

Related

Exiftool: Want to output to one text file using -w command

I'm currently trying to use exiftool in the Windows command prompt to read metadata from multiple files, then output it to a single text file.
The exact command I last tried looked like this:
exiftool.exe -FileName -GPSPosition -CreateDate -d "%m:%d:%Y %H:%M:%S" -c "%d° %d' %.2f"\" -charset UTF-8 -ext jpg -w _Coordinate_Date.txt S:\Nick\Test\
When I run this, I get 7 individual text files, each containing the output for one corresponding file. However, I simply want to output all of it to one single text file. Any help is greatly appreciated.
The -w (textout) option can only be used to write multiple files. It is not meant to be used to output to a single file. As per the docs on -w:
It is not possible to specify a simple filename as an argument -- creating a single output file from multiple source files is typically done by shell redirection
Which is what you're doing with the >> ./output.txt part of your command. The -w _Coordinate_Date.txt isn't doing anything, and I would expect it to throw an Invalid TAG name: "w _Coordinate_Date.txt" error if quoted together like that, as it gets treated as a single argument. The -w option requires two arguments: the -w itself and either an extension or a format string.
I actually figured it out: if you wrap the entire -w _Coordinate_Date.txt option in quotation marks and append the output to a file, all of the output goes into one text file,
i.e. "-w _Coordinate_Date.txt >> ./output.txt"
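The shell redirection that the documentation refers to can be sketched as below. This is only a sketch under assumptions: exiftool is on the PATH, and the function name write_metadata_report and its two arguments are made up for illustration. The options are the ones from the question (the -c coordinate format is left out to keep the quoting simple); -w is dropped entirely and the shell's > sends everything to one file.

```shell
# Hypothetical helper (name and arguments are illustrative, not part of exiftool).
# Prints the selected tags for every .jpg in a directory and lets the shell,
# rather than -w, collect all of the output in a single text file.
write_metadata_report() {
    src_dir=$1   # directory containing the images
    out_file=$2  # single text file that receives all output
    exiftool -FileName -GPSPosition -CreateDate \
        -d '%m:%d:%Y %H:%M:%S' -ext jpg "$src_dir" > "$out_file"
}
```

Usage would be something like write_metadata_report ./Test ./output.txt. In the Windows command prompt the same idea applies: leave out -w and end the command with > output.txt.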

OracleSolaris 11.2 - ctags and vi

On a freshly installed Oracle Solaris I have ctags from the base-developer-utilities package. It doesn't support recursive mode, so I generate tags as follows:
% cd my_sources; rm -f tags; touch tags
% find . -name '*.c' -o -name '*.h' -exec ctags -v -u {} \;
The tags get generated, but for some reason vim is unable to use them: it doesn't see any tags even though I added the file with set tags, and instead reports error E426: tag not found.
The tag is in the tags file.
Does anybody have a clue what could be wrong? Thanks.
If vi complains that the tag isn't there, it is probably because it isn't. You can confirm that by opening the tags file in a text editor and searching for it.
The reason it isn't there is that you are overwriting the contents of the tags file for each file find encounters, so it ends up containing only the tags of the last file. To overcome this, add the -a (append) argument, which is available according to its man page.
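A sketch of the append approach, assuming the system ctags accepts -a as described above. The function name build_tags is hypothetical, and the -v and -u flags from the question are left out. One extra change goes beyond the answer: the -name tests are grouped with \( \), because without the grouping find applies -exec only to the '*.h' branch (the implicit AND binds tighter than -o), so the .c files would never be passed to ctags.

```shell
# Hypothetical helper: rebuild the tags file for all C sources under a directory.
# The body runs in a subshell, so the caller's working directory is unchanged.
build_tags() (
    cd "$1" || exit 1
    rm -f tags
    touch tags
    # Group the -name tests so that -exec applies to both *.c and *.h files;
    # -a appends to the tags file instead of overwriting it on each invocation.
    find . \( -name '*.c' -o -name '*.h' \) -exec ctags -a {} +
)
```

It would be called as build_tags my_sources, after which the tags file can be checked for the symbol that vi could not find.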
As an alternative you can try compiling a more recent ctags from source in order to use the recursive mode with the -R --languages=c arguments. If you decide to compile from source, I suggest that you use universal-ctags.

Extract metadata for package component files using Tika

I am trying to extract metadata for package component files using Tika at the command line, but I can only seem to get it to output metadata for the containing package file. Example: test_files.zip contains two files, test1.doc and test2.doc. I want to get the metadata for test1.doc and test2.doc, but cannot figure out how to do so.
I tried to run this:
java -jar tika-app-1.5.jar -m test_files.zip
but that just outputted the Content-Length, Content-Type, and resourceName for test_files.zip.
I also tried to run this:
java -jar tika-app-1.5.jar -h test_files.zip
That outputted the HTML for each component file, wrapped in a <div> with class "package-entry", but the metadata tags were again outputted only for the containing package file test_files.zip. I tried using the -x parameter instead of -h, and no parameter at all, and got exactly the same result.
How do I get the metadata for the component files? I don't mind parsing the embedded metadata from XHTML, but I cannot figure out how to get it injected into the XHTML or otherwise outputted.
Any help much appreciated. Thank you.
Since you've said you want to do it with only the tika-app jar, your best option is something like this:
# Remember where the Tika jar is before changing directory
# (TIKA_JAR is just a variable name chosen here; $INPUT should be an absolute path)
TIKA_JAR="$PWD/tika-app-1.5.jar"
# Create a temp directory
cd /tmp
mkdir tika-extracted
cd tika-extracted
# Have Tika extract out all the embedded resources
java -jar "$TIKA_JAR" --extract "$INPUT"
# Process each one in turn
for e in *; do
    java -jar "$TIKA_JAR" --metadata "$e"
done
# Tidy up
cd /tmp
rm -rf tika-extracted
Using Java, you'd be able to register your own EmbeddedDocumentExtractor on the ParseContext, and use that to trigger the metadata extraction for each one individually.

How do I get the raw version of a gist from github?

I need to load a shell script from a raw gist, but I can't find a way to get the raw URL.
curl -L address-to-raw-gist.sh | bash
And yet there is: look for the Raw button (at the top right of the source code).
The raw URL should look like this:
https://gist.githubusercontent.com/{user}/{gist_hash}/raw/{commit_hash}/{file}
Note: it is possible to get the latest version by omitting the {commit_hash} part, as shown below:
https://gist.githubusercontent.com/{user}/{gist_hash}/raw/{file}
February 2014: the raw url just changed.
See "Gist raw file URI change":
The raw host for all Gist files is changing immediately.
This change was made to further isolate user content from trusted GitHub applications.
The new host is
https://gist.githubusercontent.com.
Existing URIs will redirect to the new host.
Before it was https://gist.github.com/<username>/<gist-id>/raw/...
Now it is https://gist.githubusercontent.com/<username>/<gist-id>/raw/...
For instance:
https://gist.githubusercontent.com/VonC/9184693/raw/30d74d258442c7c65512eafab474568dd706c430/testNewGist
KrisWebDev adds in the comments:
If you want the last version of a Gist document, just remove the <commit>/ from URL
https://gist.githubusercontent.com/VonC/9184693/raw/testNewGist
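The URL patterns above can be put into a small helper. This is a minimal sketch; the function name gist_raw_url and its argument order are invented here, and it only assembles the string without checking that the gist exists.

```shell
# Hypothetical helper: build a raw gist URL from its parts.
# Arguments: user, gist id, file name, and optionally a commit hash.
# Without a commit hash the URL points at the latest version of the file.
gist_raw_url() {
    user=$1
    gist_id=$2
    file=$3
    commit=${4:-}
    if [ -n "$commit" ]; then
        printf 'https://gist.githubusercontent.com/%s/%s/raw/%s/%s\n' \
            "$user" "$gist_id" "$commit" "$file"
    else
        printf 'https://gist.githubusercontent.com/%s/%s/raw/%s\n' \
            "$user" "$gist_id" "$file"
    fi
}
```

For example, curl -L "$(gist_raw_url VonC 9184693 testNewGist)" would fetch the latest version of the file from the example above.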
One can simply use the GitHub API.
https://api.github.com/gists/$GIST_ID
Reference: https://miguelpiedrafita.com/github-gists
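To show how the API response could be used: the JSON returned for a gist lists each file with a raw_url field, so the raw address can be pulled out of it. A rough sketch follows. The function name extract_raw_urls is made up, and the grep/sed text matching is a shortcut that assumes the usual "raw_url": "..." layout; a real JSON parser such as jq would be more robust.

```shell
# Hypothetical filter: read a gist's API JSON on stdin and print every raw_url.
# Plain text matching, not real JSON parsing; assumes the URLs contain no
# escaped double quotes.
extract_raw_urls() {
    grep -o '"raw_url": *"[^"]*"' | sed 's/^"raw_url": *"//; s/"$//'
}
```

It would be used as curl -s "https://api.github.com/gists/$GIST_ID" | extract_raw_urls, printing one URL per file in the gist.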
GitLab snippets provide short, concise URLs, are easy to create, and go well with the command line.
Example: Enable bash completion by patching /etc/bash.bashrc
sudo su -
(curl -s https://gitlab.com/snippets/21846/raw && echo) | patch -s /etc/bash.bashrc

CVS command to get brief history of repository

I am using the following command to get a brief history of the CVS repository.
cvs -d :pserver:*User*:*Password*@*Repo* rlog -N -d "*StartDate* < *EndDate*" *Module*
This works just fine except for one small problem. It lists all tags created on each file in that repository. I want the tag info, but I only want the tags that were created in the date range specified. How do I change this command to do that?
I don't see a way to do that natively with the rlog command. Faced with this problem, I would write a Perl script to parse the output of the command, correlate the tags to the date range that I want and print them.
Another solution would be to parse the ,v files directly, but I haven't found any robust libraries for doing that. I prefer Perl for that type of task, and the parsing modules don't seem to be very high quality.