Using variable in exiftool -if condition

I am trying to reorganise images based on keywords found in the IPTC metadata. More specifically, I need to sort images into directories based on the species name in the subject pseudo-tag of exiftool.
To do this, I have compiled the keywords in a .txt file (species_ls.txt), with each keyword on a new line, as such:
Asian Tortoise
Banded Civet
Banded Linsang
...
To sort the images I have created the following for loop, which iterates through each line of the document, with sed pulling out the keyword. Here, $line_no is the number of lines in the species_ls.txt file, and image_raw is the directory containing the images.
for i in $(seq 1 $line_no); do
sp_name=$(sed -n "${i}p" < species_ls.txt)
exiftool -r -if '$subject=~/${sp_name}/i' \
'-Filename=./${sp_dir}/%f%+c%E' image_raw
done
Although the for loop runs, the -if condition in exiftool is never met. I assume this is because the variable sp_name is not being passed into the condition properly.
Any suggestions, or a better way of doing this, would be appreciated.

For the line with the condition, it would be better to use double quotes (" ") rather than single quotes (' ').
Single quotes pass their content literally, so your variable won't get expanded.
To stop the shell from also expanding $subject (which I presume you don't want, since exiftool needs to see it literally), put a \ in front of the $ to escape it.
This line should now look like:
exiftool -r -if "\$subject=~/${sp_name}/i"
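Putting the fix back into the loop, here is a sketch (I'm assuming the keyword in sp_name is also the destination directory, i.e. what ${sp_dir} was meant to hold):

for i in $(seq 1 "$line_no"); do
    # pull the i-th keyword out of the list
    sp_name=$(sed -n "${i}p" < species_ls.txt)
    # double quotes: ${sp_name} is expanded by the shell,
    # while the escaped \$subject stays literal for exiftool
    exiftool -r -if "\$subject=~/${sp_name}/i" \
        "-Filename=./${sp_name}/%f%+c%E" image_raw
done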
Hope this helps you!

Related

Using sed/awk to print ONLY words that contain a matched pattern - words starting with /pattern/ or ending with /pattern/

I have the following output:
junos-vmx-x86-64-21.1R1.11.qcow2 metadata-usb-fpc0.img metadata-usb-fpc10.img
metadata-usb-fpc11.img metadata-usb-fpc1.img metadata-usb-fpc2.img metadata-usb-fpc3.img
metadata-usb-fpc4.img metadata-usb-fpc5.img metadata-usb-fpc6.img metadata-usb-fpc7.img
metadata-usb-fpc8.img metadata-usb-fpc9.img metadata-usb-re0.img metadata-usb-re1.img
metadata-usb-re.img metadata-usb-service-pic-10g.img metadata-usb-service-pic-2g.img
metadata-usb-service-pic-4g.img vFPC-20210211.img vmxhdd.img
The output came from the following script:
images_fld=$(for i in $(ls "$DIRNAME_IMG"); do echo ${i%%/}; done)
The previous output is saved in a variable called images_fld.
Problem:
I need to extract the values junos-vmx-x86-64-21.1R1.11.qcow2,
vFPC-20210211.img, and vmxhdd.img. By values I mean the entire word.
The problem is that the directory containing all the files is always being updated and new files are added constantly, which means I cannot rely on the line number ($N) to extract the names of those files.
I am trying to use awk or sed to achieve this.
Is there a way to:
match all files ending with .qcow2 and then extract the full file name? Like: junos-vmx-x86-64-21.1R1.11.qcow2
match all files starting with vFPC and then extract the full file name? Like: vFPC-20210211.img
match all files starting with vmxhdd and then extract the full file name? Like: vmxhdd.img
I am using those patterns because the file names tend to change with each version I deploy, but the patterns .qcow2, vFPC, and vmxhdd always remain the same. For that reason I need to extract the entire string by matching only a partial pattern. Is it possible? Thanks!
Note: I cannot rely on files ending with .img, as there are quite a lot of them, so it would make it more difficult to extract the specific file names :/
This might work for you (GNU sed):
sed -nE '/\<\S+\.qcow2\>|\<(vFPC|vmxhdd)\S+\>/{s//\n&\n/;s/[^\n]*\n//;P;D}' file
If a string matches the required criteria, delimit it by newlines.
Delete up to and including the first newline.
Print/delete the first line and repeat.
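Run against the question's list saved in file, it prints one match per line:

$ sed -nE '/\<\S+\.qcow2\>|\<(vFPC|vmxhdd)\S+\>/{s//\n&\n/;s/[^\n]*\n//;P;D}' file
junos-vmx-x86-64-21.1R1.11.qcow2
vFPC-20210211.img
vmxhdd.img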
Thanks to KamilCuk I was able to solve the problem. Thank you! For anyone who may need this in the future: instead of using sed or awk, the solution was to split the list with tr and filter it with grep.
echo "$images_fld" | tr ' ' '\n' | grep '\.qcow2$\|vFPC\|vmxhdd'
Basically, the problem I was having was only to extract the names of the files ending with .qcow2 or starting with vFPC and vmxhdd.
Thank you KamilCuk
Another solution, given by potong, is to use
echo "$images_fld" | sed -nE '/\<\S+\.qcow2\>|\<(vFPC|vmxhdd)\S+\>/{s//\n&\n/;s/[^\n]*\n//;P;D}'
which gives the same output as KamilCuk's! Thanks to both.

Rename multiple files: possible unintended interpolation

I'm using brew rename to rename multiple files...
file-24.png => file.png
file-48.png => file#2x.png
file-72.png => file#3x.png
The first one succeeds with
rename 's/-24//g' *
the second and third...
rename 's/-48/#2x/g' *
and I get Possible unintended interpolation of #2 in string at (eval 2) line 1...
Escaping doesn't work:
rename 's/-48/\#2x/g' *
Other possible ways to rename multiple files in a case like this are also welcome.
I don't know what "brew rename" is, but if it uses a normal regex substitution:
's/pattern/q(#replacement)/e'
This uses the /e modifier to evaluate the replacement side as code, where the q() operator (Perl's single-quote operator) is used to insert literal characters.
Another way is to use \x23, the hex escape for the # character
's/pattern/\x23replacement/'
or just escape it, use \# in the replacement.
This is suitable when there's just one character to deal with, like here. If there's more than that, then it's easier to single-quote the whole thing with q() (for which we need the /e flag).
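Applied to the file names in the question, that might look like this (a sketch; -n previews the renames without performing them, assuming a Perl-based rename):

$ rename -n 's/-48/q(#2x)/e' *.png
$ rename -n 's/-72/q(#3x)/e' *.png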
Can't help but ask -- are you certain that you want to have # in a file name? That character gets interpreted in various ways by many tools. For instance, sticking that file name in a variable in a Perl script leads to no end of trouble. Why not simply file_at_2x.png?
This may be more of a curiosity, but if you have a lot of files you can rename them all with
's{ \-([0-9]+) }{ ($r = $1/24) > 1 && qq(_at_${r}x) || q() }ex'
This captures the number ([0-9]+) into $1. Then, it finds the ratio ($r = $1/24) and if that is >1 then (&& short-circuits) it replaces -number with _at_${r}x, otherwise (||) removes it by putting an empty string, q().
I use {}{} delimiters so that I may use / inside, and the /x modifier allows spaces inside, for readability.
Please test this carefully with (a copy of) your actual files, as always.
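A dry run with the question's files might print something like this (hypothetical output; again, test on copies first):

$ rename -n 's{ \-([0-9]+) }{ ($r = $1/24) > 1 && qq(_at_${r}x) || q() }ex' file-*.png
rename(file-24.png, file.png)
rename(file-48.png, file_at_2x.png)
rename(file-72.png, file_at_3x.png)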
I know this question is old and maybe the version of rename that apt-get installs is slightly different or improved. However, escaping with a single backslash seems to work:
$ rename -n -v 's/-48/\#2x/g' *
rename(foo-48.txt, foo#2x.txt)

How to use sed to isolate only the first part of a file

I'm running Windows and have the GnuWin32 toolkit, which includes sed. Specifically:
C:\TEMP>sed --version
GNU sed version 4.2.1
I have a text file with two sections: A fixed part I want to preserve, and a part that's appended after running a job.
In the file is a unique string that identifies the start of the part that's added, and I'd like to use Gnu sed to isolate only the part of the file that's before the unique string - i.e., so I can append different data to the fixed part each time the job is run.
I know I could keep the fixed portion in a separate file, but that adds complexity and it would be more elegant if I could just reuse the data at the start of the same file.
A long time ago I knew how to set up sed scripts, and I'm sure this can be done with sed, but I've slept since then. :)
Can you please describe how to use sed to display the lines of text in a file up to and not including a specific string?
Example:
line 1 of fixed portion
line 2 of fixed portion
unique string
line 1 of appended portion
line 2 of appended portion
line 3 of appended portion
What I'd like is to see as output:
line 1 of fixed portion
line 2 of fixed portion
I've gotten as far as:
sed -r -n -e "0,/unique string/p"
but that prints the unique string as well.
Thanks in advance.
-Noel
This should work for you:
sed -n '/unique string/q;p' file
It quits processing at unique string. Other lines get printed.
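With the sample from the question saved in a file (here assumed to be called file), that gives:

$ sed -n '/unique string/q;p' file
line 1 of fixed portion
line 2 of fixed portion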
An alternative might be to use a range address like this:
sed -n '1,/unique string/{/unique string/!p}' file
Note that sed includes the range border. We need to exclude unique string from printing.
Furthermore I'm using the -n option which makes sed suppress the output of input lines by default.
One thing, if unique string can contain characters which are also syntax characters in the regex like ...
test*
... sed might not be the right tool for the job any more since it can only match regular expressions but not fixed strings.
In that case awk might be the tool of choice:
awk 'index($0, "unique string"){exit}1' file
index($0, "string") returns a non-zero value (the position) if the string has been found in the current line. We cancel further processing of input lines in that case and don't print that line either.
The trailing 1 always evaluates to true and makes awk print all the lines until the previous condition applies.
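Run on the same sample file, it produces the same result as the sed versions:

$ awk 'index($0, "unique string"){exit}1' file
line 1 of fixed portion
line 2 of fixed portion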

How can I use sed to put spaces in between consecutive characters?

I have been trying to write a bash script that converts csv files into confluence tables.
I would like a sed command (or several) that converts:
one,two,,three
into
|one|two| |three|
Note that it needs a space when there is no data.
I have been struggling to find anything that works.
Here's an example:
# first, replace comma with pipe
y/,/|/
# loop, replacing consecutive pipes
:loop
s/||/| |/
tloop
Alternatively, you should be able to apply s/||/| |/g twice, as any || missed by the first pass (because its start overlapped the end of a previous substitution) will be caught by the second.
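As a one-liner, with two extra substitutions appended to add the leading and trailing pipes shown in the desired output (the script is split into -e chunks so the :loop label stays portable):

$ echo 'one,two,,three' | sed -e 'y/,/|/' -e ':loop' -e 's/||/| |/' -e 'tloop' -e 's/.*/|&|/'
|one|two| |three|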

Remove some rows with " in front

I have a CSV file that is causing me serious headaches going into Tableau. Some of the rows in the CSV are wrapped in quotes (" ") and some are not. I would like them all to be imported without the quotes (i.e. ignore them on rows that have them).
Some data:
"1;2;Red;3"
1;2;Green;3
1;2;Blue;3
"1;2;Hello;3"
Do you have any suggestions?
If you have a bash prompt hanging around...
You can use cat to output the file contents so you can make sure you're working with the right data:
cat filename.csv
Then, pipe it through sed so you can visually check that the quotes get deleted:
cat filename.csv | sed 's/"//g'
If the output looks good, use the -i flag to edit the file in place:
sed -i 's/"//g' filename.csv
All quotes should now be gone from filename.csv.
If your data has quotes in it, and you want to only strip the quotes that appear at the beginning and end of each line, you can use this instead:
sed -i 's/^"\(.*\)"$/\1/' filename.csv
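With the sample data from the question, the line-anchored version strips only the wrapping quotes:

$ sed 's/^"\(.*\)"$/\1/' filename.csv
1;2;Red;3
1;2;Green;3
1;2;Blue;3
1;2;Hello;3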
It's not the most elegant way to do it in Tableau, but if you cannot remove the quotes in the source file, you could create a calculated field for the first and last columns that strips the quotation marks.
Right-click on the field for the first column and choose Create/Calculated Field
Use this formula: INT(REPLACE([FirstColumn],'"',''))
Name the column accordingly
Do the same for the last column
This assumes the data you provided matches the data you work on, and that these fields are integer fields (hence the INT() usage). If they are string fields, you would want to make sure that you don't remove quotation marks that belong to the field value.