Replace IPTC keywords in picture files - powershell

I need to change the IPTC keywords in .tif picture files.
As we exported these .tif files out of a system, the keywords all got into one string and now you can't filter on those anymore.
I want to read the keywords of every file in a folder, split them and write them back in to the file.
I searched for a solution, but I can't find anything how to read and edit the IPTC keywords with PowerShell.
I hope you can help me.

Related

Rename a group of .txt files based on content that appears after a specific string in the text file?

I have a folder with hundreds of text files in it. The file names have zero meaning. I am trying to extract a variable number that appears after a string in each file and and rename the files using it. I'm somewhat of a Powershell newbie (an old timer form the DOS world) but I have written many useful scripts with it. I've searched high an low, this one has me stumped.
Any and all suggestions are welcome, or I'd happily pay someone for the snippet of code. -Ed

Is it possible to extract metadata such as Content Created date from files - I can't get this with PowerShell

I need to extract the "Content Created" date out of thousands of files, but haven't been able to find a way to do this using PowerShell / other Command Line utility.
Does someone out there know a way to obtain this metadata? If so, please can you advise me. Thanks.
I've looked at various resources online, including this site, but haven't been successful thus far.
Here's a screenshot explaining what I'm trying to do.
I've been unable to find a native powershell cmdlet which does what you want. However, I found this article: Use PowerShell to Find Metadata from Photograph Files and the script it used: get file meta data function.
The article talks about image files, but the function is not specific for image files.
I tested it out on a folder containing a Word and an Excel file and the returned Metadata from the Word file contains the Content Created date. The Excel file does not contain/return that value. This is not unexpected as the Details tab of properties for the Excel file does not contain a Content Created value so it seems to be specific for Word files, and maybe some other file or document types.
Update:
You write that you need to extract this info from thousands of files, but if those files are anything but Word-files you probably won't be able to do that.
As far as I can tell this should work with the file types exposing the type of metadata you want. However, it seems that the ContentCreated property is unique to Word. I tried adding a text file (.txt), Acrobat PDF (.pdf), MS Access (.mdb), Excel (.xlxs) and a Word doc (.docx) file to my test folder and the only one that has/returns that metadata property is the Word file.
You should also be aware that the script seems to return metadata localized, so for me to programatically get the info i wanted I had to pipe the output of the script to Select-Object -Property Name,'InnehÄll skapat' (which is the Swedish name for Content created). So if you're running on a non-english system you may need to check what the output looks like before creating your Select-Object statement.
PowerQuery in Excel 2013 or later (data tab). Connect to data> Folder.

Combining pdftk strings for specific pages

I've checked "Similar questions" and went through a lot of search but I can't seem to find a way to combine the snippets I already figured out; would be awesome if someone is able to help.
Using pdftk, alternatively running through PowerShell
I got two .pdf files (f.e.: A=1000 pages, B=5000 pages) which I need to combine in a specific way to generate a new .pdf file. In detail I need page 1-3, 4-6[...] of file A merged with page 1-4, 4-8[...] of file B with a blank page between 1-3 & 4-6.
So far I figured how to burst the files, add a blank page and combine them to a new .pdf file. Yet I'm only able to that for one needed document at a time (a new file with 8 pages).
pdftk fileC.pdf fileD.pdf cat output fileE.pdf
pdftk A=fileE.pdf B=blankpage.pdf cat A1-1 B1-1 A2-4 output conclusion.pdf
Now I'm wondering if there's a way to output the complete file with a command? Otherwise I'd have to do it for every merge of two long files.
Thanks in advance!

Extracting file names from an online data server in Matlab

I am trying to write a script that will allow me to download numerous (1000s) of data files from a data server (e.g, http://hydro1.sci.gsfc.nasa.gov/thredds/catalog/GLDAS_NOAH10SUBP_3H/2011/345/). Unfortunately, the names of the files in each directory are not formatted in a similar way (the time that they were created were appended to the end of the file name). I need to be able to specify the file name to subset the data (I have a special tool for these data types) and download it. I cannot find a function in matlab that will extract the file names.
I have looked at URLREAD, but it downloads everything including html code.
Thanks for your help!
You can easily parse the link.
x=urlread(url)
links=regexp(x,'<a href=''([^>]+)''>','tokens')
Reads every link, you have to filter all unwanted links.
For example this gets all grb files:
a=regexp(x,'<a href=''([^>]+.grb)''>','tokens')

C# folder and subfolder

Upon numerous searches, I am here to see if someone has any idea on how I should go about tackling this issue.
I have a folder with sub-folders. The sub-folder containers each has files of different file types e.g. pdf, png, jpeg, tiff, avi and word documents.
My goal is to write a code in C# that will go into the subfolder, and combined all the files into one pdf using the name of the folder. The only exception is that a file such as avi will not be pdf'ed in which case I want a nudge as to which folder it is and possibly file name. I am trying to use the form approach, so that you can copy in the folder pathname and also destination of the created pdf.
Thanks.
to start, create a FolderBrowserDialog to get the root folder. Alternatively just make a textbox in which you paste the folder name ( less preferred since the first method gives you nicer error-handling straight out of the box )
In order to iterate through, see How to: Iterate Through a Directory Tree
To find the filetype, check System.IO.FileInfo.Extension for each file you iterate through. Add those to list with the data you need. ( hint, create a list of objects in which your object reflects the data you need such as path, type etc,... ). If its an avi don't toss it in the list but flash a warning (messagebox?) instead.
From here the original question gets fuzzy. What exactly do you need in the pdf. Just the filenames and locations or do you actually want to throw the actual contents of the file in pdf?