Extract previews in batch and name the previews using inodes? - exiftool

Is it possible to extract previews in batch (from images in a folder) and name each preview using the image's inode?
Example output:
path/to/previews_folder/{image_parent_folder_inode}/{image_inode}.jpg

Your question is a bit unclear, but if you are asking whether exiftool can directly access inodes, the answer is no. Exiftool does have its FileName and Directory pseudo-tags, which can be read and manipulated.
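However, exiftool extracts previews in batch readily enough with -b and -PreviewImage, so you could pull the inodes from the shell and build the names yourself. A rough sketch (bash with GNU stat assumed; the paths are placeholders taken from your example):
for f in /path/to/images/*.jpg; do
  dir_inode=$(stat -c %i "$(dirname "$f")")    # inode of the parent folder
  img_inode=$(stat -c %i "$f")                 # inode of the image itself
  mkdir -p "path/to/previews_folder/$dir_inode"
  exiftool -b -PreviewImage "$f" > "path/to/previews_folder/$dir_inode/$img_inode.jpg"
done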

Related

File Date to Exif Data

I have a bunch of old folders containing photos from the 2000s.
For an unknown reason, the photos have no Exif data.
But I want, at least, to keep the date of each photo, so that they are in order when I import them into iCloud Photos.
Is there a piece of software, a command-line method, or a script in any programming language that can take the date from each file's properties and insert it into the photo's Exif data?
Thanks in advance.
You should be able to do this with exiftool. Make a backup of your files first, then try this on a single file:
exiftool "-alldates<filecreatedate" ONEIMAGE.jpg
Then check your "file properties" that you refer to and also check with:
exiftool ONEIMAGE.jpg
If that all looks correct, you can do all files in a directory like this:
exiftool "-alldates<filecreatedate" DIRECTORYNAME

Is it possible to extract metadata such as Content Created date from files - I can't get this with PowerShell

I need to extract the "Content Created" date out of thousands of files, but haven't been able to find a way to do this using PowerShell or another command-line utility.
Does someone out there know a way to obtain this metadata? If so, please advise me. Thanks.
I've looked at various resources online, including this site, but haven't been successful thus far.
Here's a screenshot explaining what I'm trying to do.
I've been unable to find a native PowerShell cmdlet which does what you want. However, I found this article: Use PowerShell to Find Metadata from Photograph Files, and the script it uses: get file meta data function.
The article talks about image files, but the function is not specific for image files.
I tested it out on a folder containing a Word and an Excel file, and the metadata returned for the Word file contains the Content Created date. The Excel file does not contain/return that value. This is not unexpected, as the Details tab of Properties for the Excel file does not contain a Content Created value, so it seems to be specific to Word files, and maybe some other document types.
Update:
You write that you need to extract this info from thousands of files, but if those files are anything but Word files you probably won't be able to do that.
As far as I can tell this should work with any file type that exposes the kind of metadata you want. However, it seems that the ContentCreated property is unique to Word. I tried adding a text file (.txt), an Acrobat PDF (.pdf), an MS Access database (.mdb), an Excel workbook (.xlsx) and a Word document (.docx) to my test folder, and the only one that has/returns that metadata property is the Word file.
You should also be aware that the script seems to return metadata localized, so to programmatically get the info I wanted I had to pipe the output of the script to Select-Object -Property Name,'Innehåll skapat' (which is the Swedish name for Content created). So if you're running on a non-English system you may need to check what the output looks like before creating your Select-Object statement.
Power Query in Excel 2013 or later (Data tab): Connect to data > Folder.
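For what it's worth, exiftool reads document metadata as well, so it may offer another route; a hedged sketch (exiftool exposes the Word "Content Created" property as CreateDate, and the paths here are placeholders):
exiftool -CreateDate -csv -r C:\path\to\files > dates.csv
This writes one CSV row per file, with an empty CreateDate column for files that lack the property.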

Does exiftool require the complete file for extracting metadata

This question is about extraction of metadata only.
Does exiftool need the complete file in order to work properly?
Scenario:
I want to extract the metadata of a 20 GB video file. Do I need to provide exiftool with the complete file (via stdin), or is it enough to provide a certain number of bytes?
Motivation:
I am programmatically (golang) calling exiftool in a streaming context and want the extraction to be as fast as possible. Magic-number file type detection works with the first 33 bytes, and I am wondering if something similar is possible with exiftool metadata as well.
The answer depends upon the file and the location of the metadata within that file.
There are a couple of threads on the subject on the ExifTool forums (link 1, link 2), and Phil Harvey, the author, says that in the case of MP4/MOV videos, the metadata is at the end of the file about half the time.
Using the -fast option might help. I've done some quick tests using cURL and a large image file (see the second-to-last example under Piping Examples), and in that case cURL didn't download the whole image file, just enough to extract the metadata. It might be different with a video file, though, as I haven't tested that situation.
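That pattern looks roughly like this (the URL is a placeholder; -fast stops exiftool from scanning to the end of the file for trailer information, and - tells it to read from stdin):
curl -s http://example.com/bigfile.jpg | exiftool -fast -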

merging PDFs with Ghostscript ignoring outline and using pdfmark instead

I am using a Batch script to merge different PDFs in one complete file.
%gsc% -dBATCH -sDEVICE=pdfwrite -sPAPERSIZE=letter -dEPSFitPage -o %dsk%%zus%%ext% %mfd% %pth%tmp\pdfmarks
%dsk%%zus%%ext%: path and name of the final (complete) document
%mfd%: path and name of the docs to be merged (c:\test\1.pdf c:\test\2.pdf ...)
%pth%tmp: path to the pdfmarks file
Additionally, I am creating a pdfmark document inside the script, which gs uses to create the bookmarks. But unfortunately, some of the docs I am merging already have their own bookmarks, and I have not yet found a way to ignore those. GS should only use the bookmarks inside the pdfmarks file.
How can this be done?
Firstly, you are not 'merging' PDF files when you use Ghostscript's pdfwrite device. The process is described in detail here.
The important point is that the way the input file(s) are constructed has no bearing on the way the output file is constructed. If any other software you use relies on the file being constructed in a particular fashion it may not work on the output PDF file.
The -dEPSFitPage switch only has any effect when the input is an EPS file. If you want to 'fit' PostScript or PDF files then you need to use -dPDFFitPage, -dPSFitPage or just -dFitPage. However, all of these rely on you first selecting a media size, and then preventing it being altered by setting -dFIXEDMEDIA. For EPS files you would more normally use -dEPSCrop which sets the media size to the EPS declared BoundingBox.
You can prevent the PDF interpreter from reading the Outlines tree (which you are calling Bookmarks) and creating a pdfmark from it to pass to the pdfwrite device by using the -dNO_PDFMARK_OUTLINES switch, which oddly isn't documented, presumably an oversight.
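Applied to the command in the question, that might look like the following (an untested sketch reusing the same variables, with -dEPSFitPage swapped for -dFitPage plus -dFIXEDMEDIA as described above):
%gsc% -dBATCH -dNO_PDFMARK_OUTLINES -sDEVICE=pdfwrite -sPAPERSIZE=letter -dFIXEDMEDIA -dFitPage -o %dsk%%zus%%ext% %mfd% %pth%tmp\pdfmarks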

Extracting file names from an online data server in Matlab

I am trying to write a script that will allow me to download numerous (1000s of) data files from a data server (e.g., http://hydro1.sci.gsfc.nasa.gov/thredds/catalog/GLDAS_NOAH10SUBP_3H/2011/345/). Unfortunately, the names of the files in each directory are not formatted in a similar way (the time they were created is appended to the end of the file name). I need to be able to specify the file name to subset the data (I have a special tool for these data types) and download it. I cannot find a function in Matlab that will extract the file names.
I have looked at URLREAD, but it downloads everything, including HTML code.
Thanks for your help!
You can easily parse the page.
url = 'http://hydro1.sci.gsfc.nasa.gov/thredds/catalog/GLDAS_NOAH10SUBP_3H/2011/345/';
x = urlread(url);                                     % raw HTML of the directory listing
links = regexp(x, '<a href=''([^>]+)''>', 'tokens');  % capture every href target
This reads every link; you then have to filter out the unwanted ones.
For example, this gets all .grb files:
a = regexp(x, '<a href=''([^>]+\.grb)''>', 'tokens'); % escape the dot so only .grb names match
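If the captured hrefs turn out to be plain relative file names (an assumption; THREDDS catalogs often link to catalog pages rather than the files themselves), a download loop could look like this:
% Hypothetical sketch: fetch each matched .grb file into the current folder
for k = 1:numel(a)
    fname = a{k}{1};               % 'tokens' yields a cell array of cell arrays
    urlwrite([url fname], fname);  % urlwrite matches the urlread era; newer Matlab prefers websave
end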