exiv2 UserComment metatag - metadata

What is the difference between the following two commands for setting a comment string?
exiv2 -c tera img.JPG
exiv2 -M"set Exif.Photo.UserComment adagio" img.JPG
I can access them with
$ exiv2 -p c img.JPG
tera
$ exiv2 -p S img.JPG | grep adagio
450 | 0x9286 UserComment | UNDEFINED | 14 | 38546 | ........adagio
What would be the proper way to add a simple ASCII string that won't be longer than a dozen characters?

The first command saves the text to the jpeg COM block (see Jpeg Syntax and structure). This is a jpeg only piece of metadata.
The second command saves the text to the EXIF UserComment tag. This is part of the EXIF standard of metadata.
The jpeg COM comment is a fairly fragile place to put metadata, as some programs will either not save it or overwrite it with their own text. The UserComment is less likely to be lost or overwritten by most programs.
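For a short ASCII string, a minimal sketch of the second approach could look like the following. It assumes exiv2's documented `charset=Ascii` prefix for UserComment values; the file name and comment text are simply the ones from the question, and the guard around the call is only there so the snippet does nothing when exiv2 or the image is missing.

```shell
# File name and comment text as used in the question (illustrative values).
img=img.JPG
comment=adagio

# "charset=Ascii" asks exiv2 to write the ASCII character-code header
# that the Exif standard defines for UserComment.
modify_cmd="set Exif.Photo.UserComment charset=Ascii $comment"

# Only touch the image if exiv2 is installed and the file exists.
if command -v exiv2 >/dev/null 2>&1 && [ -f "$img" ]; then
  exiv2 -M"$modify_cmd" "$img"
  # Print the key and its interpreted value to confirm what was written.
  exiv2 -g Exif.Photo.UserComment -Pkt "$img"
fi
```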


Use crunch to generate all the possible IATA code combinations

I don't really know how to formulate this, but I have a bunch of IATA codes, and I want to generate all the possible combinations (e.g. JFK/LAX, BOS/JFK, etc.), separated by a character such as "/" or "|".
Here we assume your IATA codes are stored in the file file; one code per line.
crunch has the -q option which generates permutations of lines from a file. However, in this mode crunch ignores most of the other options like <max-len>, which would be important here to print only pairs of codes.
Therefore, it would be easier and faster to …
Use something different than crunch
For instance, try
join -j2 -t/ -o 1.1 2.1 file file | awk -F/ '$1!=$2'
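To see what this produces, here is a small sketch that runs the same command on a made-up three-line file (the temporary file and the three codes are assumptions for illustration only):

```shell
# Hypothetical sample input: three IATA codes, one per line.
file=$(mktemp)
printf '%s\n' JFK LAX BOS > "$file"

# Joining on the (empty) second field pairs every line with every line;
# awk then drops the pairs where both codes are identical.
pairs=$(join -j2 -t/ -o 1.1 2.1 "$file" "$file" | awk -F/ '$1!=$2')
printf '%s\n' "$pairs"

rm -f "$file"
```

With three codes this should list the six ordered pairs, so both JFK/LAX and LAX/JFK appear.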
If you really, really, really want, you can …
Translate the input into something crunch can work with
We translate each line from file to a unique single character, supply that list of characters to crunch, and then translate the result back.
crunch supports Unicode characters, so files with more than 255 lines are totally fine. Here we enumerate the lines in file by characters in Unicode's Supplementary Private Use Area-A. Therefore, file may have at most 65'534 lines.
If you need more lines, you could combine multiple Unicode planes, but at some point you might run into ARG_MAX issues. Also, with 65'534 lines you would already generate (a bit less than) 65'534^2 = 4'294'705'156 pairs, occupying more than 34 GB when translated into pairs of IATA codes.
I suspect the back-translation to be a huge slowdown, so the above alternative seems to be better in every aspect (efficiency, brevity, maintainability, …).
# This assumes your locale is using any Unicode encoding,
# e.g. UTF-8, UTF-16, … (doesn't matter which one).
file=...
((offset=0xF0000))
charset=$(
echo -en "$(bc <<< "obase=16;
max=$offset+$(wc -l < "$file");
for(i=$offset;i<max;i++) {\"\U\"; i}" |
tr -d \\n
)"
)
crunch 2 2 "$charset" -d 1# --stdout |
iconv -t UTF-32 |
od -j4 -tu4 -An -w12 -v |
awk -v o="$offset" 'NR==FNR{a[o+NR-1]=$0;next} {print a[$1]"/"a[$2]}' "$file" -
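The size estimate for the 65'534-line case can be reproduced with shell arithmetic. This is only a rough check; it assumes 8 bytes per output line (two 3-letter codes, one separator, one newline) and a shell with 64-bit integer arithmetic.

```shell
# Upper bound on the number of pairs for the largest supported input.
lines=65534
pairs=$((lines * lines))

# Assumed line length: 3 + 1 + 3 characters plus a trailing newline.
bytes=$((pairs * 8))

echo "pairs: $pairs"
echo "bytes: $bytes"
```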

Use processed output from stdin as a replacement string in Sed

Following command gives me the output I want:
$ sed '/^<template.*>/,/<\/template>/!d;//d' src/components/**/*.vue | html2jade
in that it processes each template containing html into its pug equivalent.
Would it be possible now to somehow replace the originally found html in all those files, with this now
processed output? There is also some other content outside template tags, which should stay as it is,
namely some script and style tags.

How to get out of PowerShell's encoding hell?

> cat .\foo.txt
abc
> cat .\foo.txt | md5sum
c13b6afecf97ea6b38d21a8f5167fa1e *-
> md5sum foo.txt
b79545611b3be30f90a0d21ef69bca82 *foo.txt
cat and md5sum are the Unix ports (from the Windows Git distribution).
This is a toy example for my real use case, which is piping binary data to a legacy Python script that I can't change. Because the pipe applies an encoding, the binary file becomes corrupted.
I tried changing $OutputEncoding, [Console]::OutputEncoding and using chcp, all didn't help (but maybe I was not doing it right, this is all very convoluted...).
The utility from "PowerShell's pipe adds linefeed" doesn't work for me because of how it handles the process arguments (I need to pass several arguments to the legacy script, and some of them need to be quoted, but the utility accepts all arguments as one string).
The optimal solution for me would be to somehow tell PowerShell to turn off encoding completely and just behave like Unix/cmd.
There is no way around it, except to use cmd to run the commands including the pipe:
cmd /c cat.exe .\foo.txt "|" md5sum
Note the pipe character is quoted, so it is interpreted by cmd and not powershell.
If you're using the Get-Content cmdlet, then follow the recommendation given at https://technet.microsoft.com/en-us/library/hh847788.aspx for dealing with binary data:
When reading from and writing to binary files, use a value of Byte for the Encoding dynamic parameter and a value of 0 for the ReadCount parameter.
Regardless of whether or not you're using Get-Content, you'll probably want to avoid ever having your data represented as a String. The String type is designed for character data, and doesn't do well for handling binary data.

Using grep to correct XML files

I have a folder and sub folder that contains 2000 xml files.
Need to process all the files with BizTalk systems.
But some of the files have wrong tags, e.g.
<streetName> Bombay Crescent </addressRegion>
where the closing tag should be </streetName>.
I need to use grep to find and replace the wrong tags only.
I.e. with a for loop: find any XML file with the wrong tag and replace it.
Only the tag "streetName" is affected. And only "addressRegion" is in the wrong place.
I would like to do something like:
grep -Po where streetName and *** /addressRegion; if the condition is true,
replace /addressRegion with /streetName
Thanks in Advance
The following will look for a <streetName> tag that is closed by </addressRegion>, and will change addressRegion to streetName. It will replace all occurrences on the line. The street name must not contain any < signs, as that would break the matching.
sed -e 's:\(<streetName>[^<]*\)</addressRegion>:\1</streetName>:g'
The command reads its standard input and writes standard output.
sed -i does the replacement in place in all of its input files:
sed -i -e 's:\(<streetName>[^<]*\)</addressRegion>:\1</streetName>:g' folder/subfolder/*.xml
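Since the question mentions files in subfolders as well, the same substitution can be combined with find so that every level is covered. The following is a sketch on a throwaway directory; the directory layout and file names are made up, and it assumes GNU sed, where -i takes no argument.

```shell
# Hypothetical sample tree; directory and file names are made up for illustration.
dir=$(mktemp -d)
mkdir -p "$dir/sub"
printf '%s\n' '<address><streetName>Bombay Crescent</addressRegion></address>' > "$dir/sub/a.xml"
printf '%s\n' '<address><addressRegion>Goa</addressRegion></address>' > "$dir/b.xml"

# Recurse into all subfolders and fix only <streetName>...</addressRegion> pairs.
# Assumes GNU sed, where -i takes no argument.
find "$dir" -type f -name '*.xml' \
  -exec sed -i -e 's:\(<streetName>[^<]*\)</addressRegion>:\1</streetName>:g' {} +

fixed=$(cat "$dir/sub/a.xml")
untouched=$(cat "$dir/b.xml")
printf '%s\n%s\n' "$fixed" "$untouched"

rm -rf "$dir"
```

A correctly paired <addressRegion> element is left alone, because the pattern only matches when <streetName> opens the element.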

How to assign number for a repeating pattern

I am doing some calculations using Gaussian. From the Gaussian output file, I need to extract the input structure information. The output file contains more than 800 structure coordinates. What I have done so far is collect all the input coordinates using a combination of grep, awk and sed commands, like so:
grep -A 7 "Input orientation:" test.log | grep -A 5 "C" | awk '/C/{print "structure number"}1' | sed '/--/d' > test.out
This helped me to grep all the input coordinates and insert a line with "structure number". So now I have a file that contains a pattern which is being repeated in a regular fashion. The file is like the following:
structure Number
4.176801 -0.044096 2.253823
2.994556 0.097622 2.356678
5.060174 -0.115257 3.342200
structure Number
4.180919 -0.044664 2.251182
3.002927 0.098946 2.359346
5.037811 -0.103410 3.389953
Here, "structure Number" is repeated. I want to append a number, like "structure number 1", "structure number 2", in increasing order.
How can I solve this problem?
Thanks for your help in advance.
I am not familiar at all with a program called gaussian, so I have no clue what the original input looked like. If someone posts an example I might be able to give an even shorter solution.
However, as far as I understand, the OP is content with the output of his/her code, except that he/she wants to append an increasing number to the lines inserted with awk.
This can be achieved with the following line (adjusting the OP's code):
grep -A 7 "Input orientation:" test.log | grep -A 5 "C" | awk '/C/{print "structure number"++i}1' | sed '/--/d' > test.out
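The ++i counter can also be tried in isolation. The sketch below applies it to a shortened copy of the sample output from the question; it numbers header lines that already exist instead of inserting them, so it illustrates the idiom rather than replacing the pipeline above.

```shell
# Sample taken from the question's test.out (shortened to two structures).
sample='structure Number
4.176801 -0.044096 2.253823
structure Number
4.180919 -0.044664 2.251182'

# Append a running counter to every header line; all other lines pass through.
numbered=$(printf '%s\n' "$sample" | awk '/^structure/{$0 = $0 " " (++i)} 1')
printf '%s\n' "$numbered"
```

An unset awk variable counts as 0, so the first header gets 1 without any initialisation.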
Addendum:
Even without knowing the actual input, I am sure that one can at least get rid of the sed command leaving that piece of work to awk. Also, there is no need to quote a single character grep pattern:
grep -A 7 "Input orientation:" test.log | grep -A 5 C | awk '/C/{print "structure number"++i}!/--/' > test.out
I am not sure since I cannot test, but it should be possible to let awk do the grep's work, too. As a first guess I would try the following:
awk '/Input orientation:/{li=7}!li{next}{--li}/C/{print "structure number"++i;lc=5}!lc{next}{--lc}!/--/' test.log > test.out
While this might be a little bit longer in code it is an awk-only solution doing all the work in one process. If I had input to test with, I might come up with a shorter solution.