Create Array without SourceFile using Exiftool

I would like to use ExifTool to create an array of the Artist tags for all files in a directory. The exiftool command below works, but the output is not desirable:
C:\exiftool.exe -Artist Dir
The output looks something like this:
======== E:/File1.jpg
Artist : user1
======== E:/File2.jpg
Artist : user2
1 directories scanned
2 image files read
I would like the output to look something like this:
user1_user2
Or at least a simple array or output like:
user1
user2

To expand upon the answer I posted in the exiftool forums:
exiftool -q -s3 -Artist DIR
This will output each Artist tag on a line by itself, like your second example.
-q - Suppresses normal informational messages. This will suppress the file name listings and the final summary lines.
-s3 - Short output format. Adding 3 to the option makes it print only the tag values.
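If you want the single-line user1_user2 form from your first example, you could pipe that output through paste (a sketch assuming a Unix-like shell; on Windows this would need WSL, Cygwin, or similar):
exiftool -q -s3 -Artist DIR | paste -sd_ -
Here -s joins all input lines into one and -d_ sets the underscore as the separator.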

Related

creating a per sample table from a vcf using bcftools

I have a multi-sample VCF file and I want to get a table with IDs in the left column and the variants in which they have an alternate allele. It should look like this:
ID1 chr2:87432:A:T_0/1 chr10:43234:C:G_1/1
ID2 chr2:87432_A:T_1/1
ID3 chr11:432434:T:G chr14:34234234:C:G chr20:34324234:T:C
This is then to be read into R.
I have tried combinations of:
bcftools query -f '[%SAMPLE\t] %CHROM:%POS:%REF:%ALT[%GT]\n'
but I keep getting sample IDs overlapping on the same line and I can't quite figure out the syntax.
Your help would be much appreciated.
You cannot achieve what you want with a single BCFtools command. BCFtools parses one VCF variant at a time. However, you can use a command like this to extract what you want:
bcftools +split -i 'GT="0/1" | GT="1/1"' -Ob -o DIR input.vcf
This will create one small .bcf file for each sample, and you can then run multiple instances of bcftools query to get what you want.
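A minimal sketch of that second step, assuming the +split plugin writes one DIR/<sample>.bcf per sample (its default naming), could look like this:
for f in DIR/*.bcf; do
    # each split file holds exactly one sample and is named after it
    sample=$(basename "$f" .bcf)
    printf '%s' "$sample"
    # append one tab-separated CHROM:POS:REF:ALT_GT field per variant
    bcftools query -f '\t%CHROM:%POS:%REF:%ALT[_%GT]' "$f"
    printf '\n'
done > table.txt
The resulting table.txt should have one row per sample, ready to read into R.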

Instagram media.json: How to add the metadata to EXIF?

I got my photos out of Instagram as a zip file with all the photos I have there, but they don't have any EXIF data on them.
The zip file also contains a JSON file called media.json where all the important metadata is. So is there any way to get this metadata into the photos' EXIF?
ExifTool can import things from files into EXIF, but first I need to know what format the metadata file has to be in.
This is an example of what the Instagram media.json file content looks like:
{
  "photos": [
    {
      "caption": "#nautitaan #kesä2019",
      "taken_at": "2019-06-08T03:30:25",
      "location": "Jokioinen",
      "path": "photos/201906/b65bbda42ba74424a9d7be0c5163f78d.jpg"
    },
    {
      "caption": "#lupanauttia #kesä2019",
      "taken_at": "2019-06-07T07:42:38",
      "location": "Jokioinen",
      "path": "photos/201906/29fb24838136a1e80439ad7dcae00b4f.jpg"
    }
  ]
}
I only need the taken_at entries; everything else is just a bonus.
ExifTool can read JSON files. If you run the command exiftool -g1 -a -s on your example JSON file, you will get a list of the tag names you can copy into your image file. Using your example, the result would be:
[JSON] PhotosCaption : #nautitaan #kesä2019, #lupanauttia #kesä2019
[JSON] PhotosLocation : Jokioinen, Jokioinen
[JSON] PhotosPath : photos/201906/b65bbda42ba74424a9d7be0c5163f78d.jpg, photos/201906/29fb24838136a1e80439ad7dcae00b4f.jpg
[JSON] PhotosTaken_at : 2019-06-08T03:30:25, 2019-06-07T07:42:38
The problem now is that there are multiple items for each tag name. ExifTool is very flexible about how it reads numbers for time stamps (see ExifTool FAQ 5), so if the first entry is the correct one, you can simply use
exiftool -TagsFromFile FILE.Json "-DateTimeOriginal<PhotosTaken_at" FILE.jpg
If you want to use the second entry, you can use the -listitem option.
exiftool -listitem 1 -TagsFromFile FILE.Json "-DateTimeOriginal<PhotosTaken_at" FILE.jpg
Note that the list index starts at 0, so to get the second item, you would index #1.
To bulk copy, assuming that the base filename of the json file is the same as the image file and in the same directory, you could use this command
exiftool -TagsFromFile %d%f.Json "-DateTimeOriginal<PhotosTaken_at" /path/to/image/files/
This command creates backup files. Add -overwrite_original to suppress the creation of backup files. Add -r to recurse into subdirectories. If this command is run under Unix/Mac, reverse any double/single quotes to avoid bash interpretation.
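Putting those options together, a bulk run without backup files and with recursion might look like this (same assumption that each image has a matching .Json base name; Windows-style quoting shown):
exiftool -r -overwrite_original -TagsFromFile %d%f.Json "-DateTimeOriginal<PhotosTaken_at" /path/to/image/files/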

Change one word in lots of HTML pages with another word from a list of words

I have about 2000 HTML pages (all pages have the same content except for the city name). I have the list of city names, and I need each page to have one city name.
How can I change the city name in each page?
City name list:
birmingham
montgomery
mobile
huntsville
tuscaloosa
hoover.. etc...
and I need to make each page like this:
title: birmingham,
next page:
title: montgomery,
and so on.
I need the change to happen in the title: Example (City Name)
and in 2 other h2 tags.
Thank you very much for your attention!
Update:
This script works on the existing files. It will recursively find all the index.html files under the current directory and replace the "string_to_replace" string with that file's parent directory's name, which is the city name in your case. It will also capitalize that name before the replacement.
Feel free to update the template_string variable value in the script so that it matches the string used in your index.html files in place of the city name.
#!/bin/bash
template_string="string_to_replace"
current_dir=$(pwd)
find "$current_dir" -name 'index.html' | while read -r file; do
    # the file's parent directory name is the city name
    dir=$(basename "$(dirname "$file")")
    # capitalize the first letter
    city="$(tr '[:lower:]' '[:upper:]' <<< "${dir:0:1}")${dir:1}"
    sed -i -e "s/$template_string/$city/g" "$file"
done
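For example, given a hypothetical layout like
./birmingham/index.html
./montgomery/index.html
the script would substitute Birmingham and Montgomery into the respective pages.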
Initial answer:
My initial suggestion is to use a bash script (e.g. script.sh) similar to this:
#!/bin/bash
file="./cities.txt"
template="./template.html"
template_string="string_to_replace"
while IFS= read -r line
do
    # create one page per city from the template
    cp "$template" "$line.html"
    # replace the placeholder with the city name
    sed -i -e "s/$template_string/$line/g" "$line.html"
    echo "$line"
done < "$file"
and run it from bash terminal:
$ source script.sh
What you need to have:
cities.txt with the list of city names, e.g.
London
Yerevan
Berlin
template.html with the HTML template you need to have in each file. Make sure the city name is set as "string_to_replace" in it, e.g. title: string_to_replace
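A hypothetical minimal template.html, with the placeholder in the title and in the two h2 tags you mentioned, might look like this:
<html>
  <head><title>Example (string_to_replace)</title></head>
  <body>
    <h2>Welcome to string_to_replace</h2>
    <h2>Services in string_to_replace</h2>
  </body>
</html>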
Since you did not mention anything related to the file names, the files will be named like London.html, Yerevan.html, ...
Let me know in case you don't need to create new files and instead need to make the replacement in existing ones. In that case we'll need to update the script a bit, after you tell me how to know which string is used in each file.

Replace matches of one regex expression with matches from another, across two files

I am currently helping a friend reorganise several hundred images on a database-driven website. I have generated a list of the new, reorganised image paths offline and would like to replace each matching image reference in the SQL export of the database with the new path.
EDIT: Here is an example of what I am trying to achieve
The new_paths_list.txt is a file that I generated using a batch script after I had organised all of the existing images into folders. Prior to this all of the images were in just a few folders. A sample of this generated list might be:
image/data/product_photos/telephones/snom/snom_xyz.jpg
image/data/product_photos/telephones/gigaset/giga_xyz.jpg
A sample of my_exported_db.sql (the database exported from the website) might be:
...
,(110,32,'data/phones/snom_xyz.jpg',3),(213,50,'data/telephones/giga_xyz.jpg',0),
...
The result I want is for my_exported_db.sql to be:
...
,(110,32,'data/product_photos/telephones/snom/snom_xyz.jpg',3),(213,50,'data/product_photos/telephones/gigaset/giga_xyz.jpg',0),
...
Some pseudo code to illustrate:
1/ Find the first image name in my_exported_db.sql, such as 'snom_xyz.jpg'.
2/ Find the same image name in new_paths_list.txt
3/ If it is present, copy the whole line (the path and filename)
4/ Replace the whole path of this image in my_exported_db.sql with the copied line
5/ Repeat for all other image names in my_exported_db.sql
A regex expression that appears to match image names is:
([^)''"/])+\.(?:jpg|jpeg|gif|png)
and one to match image names, complete with path (for relative or absolute) is:
\bdata[^)''"\s]+\.(?:jpg|jpeg|gif|png)
I have looked around and have seen that sed or awk may be capable of doing this, but some pointers would be greatly appreciated. I understand that this will only work accurately if there are no duplicated filenames.
You can use sed to convert new_paths_list.txt into a set of sed replacement commands:
sed 's|\(.*\(/[^/]*$\)\)|s#data\2#\1#|' new_paths_list.txt > rules.sed
The file rules.sed will look like this:
s#data/snom_xyz.jpg#image/data/product_photos/telephones/snom/snom_xyz.jpg#
s#data/giga_xyz.jpg#image/data/product_photos/telephones/gigaset/giga_xyz.jpg#
Then use sed again to translate my_exported_db.sql:
sed -i -f rules.sed my_exported_db.sql
I think in some shells it's possible to combine these steps and do without rules.sed:
sed 's|\(.*\(/[^/]*$\)\)|s#data\2#\1#|' new_paths_list.txt | sed -i -f - my_exported_db.sql
but I'm not certain about that.
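With bash process substitution, the intermediate rules.sed can definitely be avoided without relying on -f - (a sketch assuming bash and GNU sed):
sed -i -f <(sed 's|\(.*\(/[^/]*$\)\)|s#data\2#\1#|' new_paths_list.txt) my_exported_db.sql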
EDIT:
If the images are in several directories under data/, make this change:
sed "s|image/\(.*\(/[^/]*$\)\)|s#[^']*\2#\1#|" new_paths_list.txt > rules.sed

how to find the difference between a csv file and a file containing only one column of this csv

I have a CSV file containing some user data. It looks like this:
"10333","","an.10","Kenyata","","Aaron","","","","","","","","","",""
"12222","","an.4","Wendy","","Aaron","","","","","","","","","",""
"14343","","aaron.5","Nanci","","Aaron","","","","","","","","","",""
I also have a file which has an item on each line like this:
an.10
aaron.5
What I want is to find only the lines in the CSV file contained in the list file.
So desired output would be:
"10333","","an.10","Kenyata","","Aaron","","","","","","","","","",""
"14343","","aaron.5","Nanci","","Aaron","","","","","","","","","",""
(Note how an.4 is not contained in this new list.)
I have just about any environment available to me and am willing to try almost anything aside from doing it manually, as this CSV contains millions of records and there are about 100k entries in the list itself.
How unique are the identifiers an.10 and the like?
Maybe a very small *x shell script would be enough:
for i in $(uniq list.txt); do grep "\"$i\"" data.csv; done
That would, for every unique entry in the list, return all matching lines in the csv file. (Note that uniq only removes adjacent duplicates, so sort the list first if it is unsorted.) It does not match exclusively on the identifier column, however; that could be done with awk, for example, as sketched below.
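A single-pass awk sketch that matches only on the identifier field (assuming the CSV layout shown above, with no embedded commas inside quoted fields and no stray trailing whitespace in the list file):
awk -F'","' 'NR==FNR { want[$0]; next } $3 in want' list.txt data.csv
Reading list.txt first fills the want array; the pass over data.csv then prints only rows whose third field is in the list, which should be far faster on millions of records than running one grep per list entry.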
If the csv file is data.csv and the list file is list.txt, I would do this:
for i in $(cat list.txt); do grep "$i" data.csv; done