How do I use IPTC/EXIF metadata to categorise photos?

Many photo viewing and editing applications allow you to examine and change EXIF and IPTC data in JPEG and other image files. For example, I can see things like shutter speed, aperture and orientation in the picture files that come off my Canon A430. There are many, many name/value pairs in all this metadata. But...
What do I do if I want to store some data that doesn't have a built-in field name? Let's say I'm photographing an athletics competition and I want to tag every photo with the competitor's bib number. Can I create a "bib_number" field, assign it values such as "0001", "5478" or "8124", and then search for all photos with bib_number="5478"?
I've spent a few hours searching and the best I can come up with is to put this custom information in the "keywords" field, but this isn't quite what I'm after. With this solution I'd have to craft a query like "keywords contains bib_number_5478", whereas what I want is "bib_number is 5478".
So do the EXIF and/or IPTC standards allow additional user-defined field names?
Thanks
Kev

It can be used for that, but it really shouldn't: it's meant to be user-editable and so isn't a safe place to put critical metadata. Using an XMP sidecar is better for this kind of thing: in XMP, any field added that a given app does not understand is, according to the standard, supposed to be ignored by that app and not destroyed.
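To make the sidecar idea concrete, here is a minimal sketch in Python (standard library only) that writes a custom field into an XMP packet and reads it back. The namespace URI `http://example.com/ns/race/1.0/`, the `race` prefix and the `bib_number` property name are all made up for illustration; only the `x:xmpmeta` / `rdf:RDF` / `rdf:Description` wrapper comes from XMP itself. Whether a given photo manager will index or search such a field is a separate matter and is not shown here.

```python
import xml.etree.ElementTree as ET

# Namespaces used by the XMP wrapper elements.
NS_X = "adobe:ns:meta/"
NS_RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Hypothetical custom namespace, invented for this example.
NS_RACE = "http://example.com/ns/race/1.0/"

ET.register_namespace("x", NS_X)
ET.register_namespace("rdf", NS_RDF)
ET.register_namespace("race", NS_RACE)


def build_sidecar(bib_number: str) -> bytes:
    """Return minimal XMP content carrying a race:bib_number property."""
    xmpmeta = ET.Element(f"{{{NS_X}}}xmpmeta")
    rdf = ET.SubElement(xmpmeta, f"{{{NS_RDF}}}RDF")
    desc = ET.SubElement(rdf, f"{{{NS_RDF}}}Description")
    desc.set(f"{{{NS_RDF}}}about", "")
    ET.SubElement(desc, f"{{{NS_RACE}}}bib_number").text = bib_number
    return ET.tostring(xmpmeta, encoding="utf-8")


def read_bib_number(xmp_bytes: bytes):
    """Return the race:bib_number value, or None if the field is absent."""
    root = ET.fromstring(xmp_bytes)
    node = root.find(f".//{{{NS_RACE}}}bib_number")
    return None if node is None else node.text


# Usage: an exact-match search over in-memory sidecars (file names are examples).
sidecars = {"IMG_0001.xmp": build_sidecar("5478"),
            "IMG_0002.xmp": build_sidecar("8124")}
hits = [name for name, xmp in sidecars.items() if read_bib_number(xmp) == "5478"]
```

This gives the "bib_number is 5478" style of query from the question, at the cost of doing the search yourself.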

I don't know whether there are applications that do this, but the metadata defined for JPEG files includes a Comments field, where you can assign values that could act like tags.
C# code:
using System.Windows.Media.Imaging;
using System.IO;
...
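FileStream fs = new FileStream(@"<img_path>", FileMode.Open, FileAccess.ReadWrite);
// Clone the metadata: the copy returned by the decoder is read-only.
BitmapMetadata bmd = (BitmapMetadata)BitmapFrame.Create(fs).Metadata.Clone();
bmd.Comment = "Some Comment Here";
// The clone still has to be written back with an encoder to change the file.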
Also, if you are looking for an application that already has this functionality built in, might I recommend IrfanView (open the picture, go to the Image menu, click the Comments button).
Hope this helps.

Related

keep/copy XMP with libexif

I am trying to add a thumbnail to a JPEG picture using libexif.
For now I'm borrowing the code from exif (the command line tool that is shipped by the libexif team).
However, I noticed that the XMP tags get deleted from the metadata. There is an old bug report here.
I tried to see how to achieve this anyway with libexif but I don't really understand how to get the XMP from input file and put it in the output file. I just want to copy all XMP data, I don't need to extract anything of it.
I saw there is a TAG EXIF_TAG_XML_PACKET in exif_tag.h but couldn't figure out how to read/write this tag.
A related solution is in this SO answer, but it looks complicated. I'm not familiar with coding in C.
Is it actually possible to keep all XMP when using only libexif API? Have things changed in recent years on that? How would you write this in code?
Thanks
I believe it should be somewhat straightforward. XMP fields are described in the ISO/Adobe standard. Regular Kotlin/Java/Android file I/O and some string manipulation should be all that is required.
I would start out by becoming intimately familiar with ISO 16684-1:2019. Then, write a method for your JPEG file class that grabs all the XMP fields. Store those fields in a temp file (to prevent difficult-to-recover data loss if your code or libexif crashes). Hand the file off to libexif. Generate the thumbnail. Finally, when that's done, you can restore the XMP fields. If the thumbnail is stored in an XMP field as well (and it sounds like it is), it may be easier to concatenate that field with the ones already grabbed, updating the temp file so that it contains EVERY XMP field, before adding all of the XMP fields back to the JPEG.
Unfortunately, I do not currently have the time to read a 50 page ISO standard, synthesize the information, and then write the code to implement the solution. Here's a link to the standard at least, to get you started.
https://www.iso.org/obp/ui/#iso:std:iso:16684:-1:ed-2:v1:en
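As a rough illustration of the "grab the XMP, then put it back" steps, here is a sketch in Python rather than C. It relies on the fact that a JPEG file normally carries its XMP packet in an APP1 segment whose payload starts with the signature `http://ns.adobe.com/xap/1.0/` followed by a NUL byte. The function names are my own, and the code makes simplifying assumptions: no fill bytes between segments, no extended XMP split across several segments, and the packet is re-inserted directly after the SOI marker, which some strict readers may not like.

```python
import struct

XMP_HEADER = b"http://ns.adobe.com/xap/1.0/\x00"


def iter_segments(jpeg: bytes):
    """Yield (marker, payload) for each header segment before the scan data."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = jpeg[pos + 1]
        if marker == 0xDA:  # SOS: compressed image data follows, stop here
            return
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        yield marker, jpeg[pos + 4:pos + 2 + length]
        pos += 2 + length


def extract_xmp(jpeg: bytes):
    """Return the raw XMP packet bytes, or None if the file has none."""
    for marker, payload in iter_segments(jpeg):
        if marker == 0xE1 and payload.startswith(XMP_HEADER):
            return payload[len(XMP_HEADER):]
    return None


def insert_xmp(jpeg: bytes, packet: bytes) -> bytes:
    """Return a copy of jpeg with packet added as an APP1 segment after SOI."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    body = XMP_HEADER + packet
    if len(body) + 2 > 0xFFFF:
        raise ValueError("packet too large for a single APP1 segment")
    segment = b"\xff\xe1" + struct.pack(">H", len(body) + 2) + body
    return jpeg[:2] + segment + jpeg[2:]
```

The intended use would be: call `extract_xmp` on the original file's bytes, let the libexif-based tool write its output, then call `insert_xmp` on the output bytes if `extract_xmp` returned something. Keep a copy of the original until the result has been checked.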

Modifying MP3 file header text info

Is there a method I can use in a program to access and modify the header text of MP3 files ripped from music CDs?
There is a method available in the MusicMatch Jukebox music player, but with 2000 files ripped from 50 CDs the job is quite formidable, and the "supertagging" tool is cumbersome to use.
What I have in mind is more like the visual layout of Excel, where just the three fields Artist name, Song title and Album name would be displayed.
The Artist field would have the option of repeating the top value down for all the song titles; Album would always be repeated for all song titles.
Song titles will of course have to be entered for each item.
In the ripped files, every file has the fields track#, artist and album, plus some of less importance.
Just let me know if I am at the wrong forum for my search. I just don't know anywhere else that I might go.
For programming I might use Visual FoxPro and/or assembler. I haven't used C since the early 1980s.
If you really want to develop it yourself, at least use an ID3 library, don't write the functionality yourself!
A good one is at http://id3lib.sourceforge.net/. I haven't tried it recently, but I'm sure you can call it from VFP somehow.
If you just want something that is better for tagging a shed-load of files, look at MediaMonkey.
If you want to work solely in VFP then you should use the VFP low-level file functions:
FOPEN()
FCHSIZE( )
FCLOSE( )
FCREATE( )
FEOF( )
FFLUSH( )
FGETS( )
FPUTS( )
FREAD( )
FSEEK( )
FWRITE( )
These are pretty well documented within the VFP Help system and there are numerous examples on the web.
With them you can get the 'raw' data from the MP3 file, identify what you are looking for, change it, and write it back again.
The downside is that specific 'fields' (e.g. Artist name, Song title and Album name, etc.) will not be readily recognized. You would need to write code to identify these and then identify where the values reside.
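To give an idea of what "identifying the fields" involves, here is a sketch in Python (not VFP) for the simplest case, an ID3v1 tag: a fixed 128-byte block at the very end of the file that starts with "TAG", followed by title (30 bytes), artist (30), album (30), year (4), comment (30) and a one-byte genre. This is an assumption about your files: many rippers write the more complex ID3v2 tag at the start of the file instead, and this sketch does not handle that. The function names are my own.

```python
import struct

# "TAG", title, artist, album, year, comment, genre = 128 bytes in total.
ID3V1 = struct.Struct("<3s30s30s30s4s30sB")


def _text(raw: bytes) -> str:
    """Decode a fixed-width field, dropping NUL and space padding."""
    return raw.split(b"\x00", 1)[0].decode("latin-1").rstrip()


def _field(value: str) -> bytes:
    """Encode text for a 30-byte field (struct pads short values with NULs)."""
    return value.encode("latin-1", "replace")[:30]


def read_id3v1(data: bytes):
    """Return a dict of ID3v1 text fields, or None if no tag is present."""
    if len(data) < ID3V1.size or data[-ID3V1.size:-ID3V1.size + 3] != b"TAG":
        return None
    _, title, artist, album, year, _comment, _genre = ID3V1.unpack(data[-ID3V1.size:])
    return {"title": _text(title), "artist": _text(artist),
            "album": _text(album), "year": _text(year)}


def write_id3v1(data: bytes, title: str, artist: str, album: str) -> bytes:
    """Return data with its ID3v1 tag replaced, or appended if missing."""
    if read_id3v1(data) is not None:
        old = ID3V1.unpack(data[-ID3V1.size:])
        data = data[:-ID3V1.size]
        year, comment, genre = old[4], old[5], old[6]  # keep the other fields
    else:
        year, comment, genre = b"", b"", 255  # 255 is conventionally "no genre"
    tag = ID3V1.pack(b"TAG", _field(title), _field(artist), _field(album),
                     year, comment, genre)
    return data + tag
```

Repeating one artist and album down a whole CD, as described in the question, then becomes a loop that calls `write_id3v1` with the same `artist` and `album` and a different `title` per file. Work on copies of the files until you trust the result.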
Good Luck

How to Convert IPicture to Image - .NET 4.5 TagLib Sharp

I am wanting to display the album artwork of a song (accessed via the taglib-sharp library) within a Windows Forms picture box. The problem I'm running into is that the taglib-library returns an image of type TagLib.IPicture whereas the picture box requires an object of type System.Drawing.Image.
I have scoured the internet for many hours now, looking for a way to convert from an IPicture to Image, but to no avail. The best lead I have is this: http://msdn.microsoft.com/en-us/library/system.windows.forms.axhost.getpicturefromipicture.aspx, but I have yet to see a successful example of how to implement this.
Any help as to how to convert between these two types would be much appreciated. Note: IPicture is not analogous to IPictureDisp in this case.
I've done the opposite before - turning an existing .jpg into an IPicture for embedding in an .mp3 file. I just tried reversing that operation and, after tweaking and testing, came up with this:
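// Requires: using System.IO;
TagLib.File tagFile = TagLib.File.Create(mp3FilePath);
// Pictures[0] is the first embedded picture; Data.Data is its raw byte array.
MemoryStream ms = new MemoryStream(tagFile.Tag.Pictures[0].Data.Data);
// Keep the stream open for as long as the Image is in use.
System.Drawing.Image image = System.Drawing.Image.FromStream(ms);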
Thanks for the question - I already know how I'm going to use this myself!
Update: Here's the other way (.jpg to IPicture that I've done before):
tagFile.Tag.Pictures = new TagLib.IPicture[]
{
new TagLib.Picture(new TagLib.ByteVector((byte[])new System.Drawing.ImageConverter().ConvertTo(System.Drawing.Image.FromFile(jpgFilePath), typeof(byte[]))))
};

How to use PDFTextExtractor on iTextSharp

I want to retrieve the text from a PDF file using iTextSharp. However, I wasn't able to use PdfTextExtractor the way it is used in the Java library (iText). I need a readPDFOffline method that returns the content of the file. The pseudocode below shows what I want.
private string readPDFOffline(string fileUri);
read PDF;
retrieve Text Content of This Pdf;*
save content into string contentOfflineFile;
return contentOfflineFile;
The line marked * is the part I need help with.
PdfTextExtractor is present in the most recent releases of iTextSharp, available here.
Retrieving text in PDF is not easy. Not impossible, but there are times when the only thing that will work is OCR. For all other cases, PdfTextExtractor should work. Cases of it not working are considered bugs and should be reported as such.
Be aware that there are several cases where what looks like valid text is not extractable:
Text with no encoding... just glyph indexes. OCR time.
"Text" that is just raw paths. Horribly inefficient, and time for more OCR.
"Text" that is pixels in a bitmap. OCR once more.
OCR: Optical Character Recognition. There's even a reasonably good one for free available on Google Code, though I don't recall the name off the top of my head.

saving segmented images from a number plate

Hi everyone, I'm doing a project on automatic vehicle identification and have to write the software in MATLAB. I have already done the extraction of the plate region and the segmentation of the characters. Now I want to save these segmented characters so that I can later recognise them against a database. If anybody can help, please feel free to write to me. Thanks in advance.
So if you have some data, myData then you can just issue a command save myData and you will have a new file in the current directory named myData.mat. To load the data later, just type in load myData and then you will have a new variable in the workspace named myData. There's lots more you can do with this, so you should check out help save.
Alternatively you could use a database. I've never actually used a database in Matlab, but there seems to be plenty of information about how one would go about doing this: http://www.mathworks.com/help/toolbox/database/ug/database.html