Writing Meta Information

Abstract

Writing meta information is more complicated than it may appear at first glance, which may be one reason why very few utilities do it. ExifTool uses tag names to identify the different pieces of meta information that can be extracted from a file. At last count, ExifTool recognized about 1000 different tags. Many of these tag names are shared between different metadata formats (the WhiteBalance tag is the worst offender, and can be found in 12 different places), and sometimes the same information can even be stored in different places within a single format. Couple this with the fact that many manufacturers store meta information in undocumented formats that must be reverse engineered, and you have a very complex situation.

ExifTool attempts to simplify this situation as much as possible by making reasonable decisions about where to write the information you specify, yet it maintains flexibility by allowing you to configure its priorities if necessary, or even override the decision making process entirely.

Background

For a long time, I resisted adding write abilities to ExifTool even though it was an oft-requested feature. My concerns in adding this feature were:
  1. It would complicate the ExifTool interface and make it too confusing for typical users.
  2. It would complicate the code enough to slow down processing for normal use.
  3. It would take a LOT of work to implement.
After stewing on the problem for a while, I was finally able to overcome my concerns:

1. I designed an interface that I think is easy to use for people who don't want to know the details of the file structure, yet powerful enough for people who want to do very specific things to the information.

2. I isolated all of the writing code as much as possible into separate files which autoload as required. This keeps the compilation fast for people who don't require the write feature. Also, I have left the reading routines unchanged, so they aren't slowed down by the extra code needed when writing information. Unfortunately, this meant I couldn't borrow a lot of code from the read routines (even more work for me!), but it had the advantage that I could perform additional optimizations in the write routines that I couldn't do otherwise. Although the startup costs of this implementation are fairly high (for writing only), it should be quite fast for batch writing of multiple files.

3. I decided to bite the bullet and invest the time required (...guess what I did for my Christmas vacation!). Although I thought that a big project like this would be better suited to C++ (faster execution and a broader potential user base), after programming this so far in Perl I have grown to really appreciate the automatic memory handling and other great features of Perl such as hash lookups and incredible flexibility in text manipulations afforded by regular expressions.

Current Implementation

Currently, ExifTool can write most of the EXIF tags that anyone could reasonably want to change. (Some tags are not writable because they describe physical characteristics of the image that you cannot change with ExifTool, e.g. Compression.) In addition, all of the GPS, IPTC and XMP information and most of the MakerNotes information can be edited. This gives you great power, but with great power comes great responsibility...

It is possible for you to write nonsense into a file, which could cause other image readers to throw up their hands in despair and refuse to read the image. For this reason, it is best to always preserve the original copy of your image file. The 'exiftool' script does this for you automatically by renaming the original file and always working on a copy.
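The rename-and-copy safeguard described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code used by the 'exiftool' script: the function name `rewrite_with_backup`, the `transform` callback, and the `_original` backup suffix are all assumptions made for the example.

```python
import os

def rewrite_with_backup(path, transform, suffix="_original"):
    """Write an edited copy of `path`, keeping the untouched original.

    `transform` is a hypothetical callback that takes the original bytes
    and returns the edited bytes. The backup suffix is an assumption.
    """
    backup = path + suffix
    if os.path.exists(backup):
        # Refuse to clobber an earlier backup.
        raise FileExistsError(backup)

    with open(path, "rb") as src:
        data = src.read()

    # Build the edited copy first, so a failure in transform()
    # leaves the original file exactly where it was.
    tmp = path + ".tmp"
    with open(tmp, "wb") as dst:
        dst.write(transform(data))

    os.rename(path, backup)  # the original is preserved under a new name
    os.rename(tmp, path)     # the edited copy takes the original name
    return backup
```

Because the edited copy is written before anything is renamed, an error while editing never touches the original file.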

The writing logic for ExifTool is the reverse of the reading logic. You provide human-readable values and ExifTool performs the conversions for you. For instance, you can set 'WhiteBalance' to 'Daylight' and ExifTool will change the value of WhiteBalance wherever the tag is found in the image, provided that 'Daylight' is a valid value for that location. ExifTool also does some simple matching, so you could just set it to 'day' and ExifTool will search through the valid values and choose the one that contains the string 'day'. If the value is ambiguous, the tag will not be set. If no tags can be set with the specified value, ExifTool returns an error message.

The tag values can also be specified at a numerical level, disabling the print conversions that are normally applied. This can be done on a tag-by-tag basis via the API, or on a global basis with the exiftool application using the -n option.
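The value lookup described in the two paragraphs above can be illustrated with a small Python sketch. It is not ExifTool's code: the function `resolve_value`, the `numeric` flag, and the conversion table are hypothetical, and the rule that an exact match takes precedence over a substring match is an assumption of this example.

```python
def resolve_value(user_value, print_conv, numeric=False):
    """Return the raw value to store for `user_value`, or None if unset.

    `print_conv` maps raw stored values to human-readable labels.
    With numeric=True the print conversion is bypassed entirely.
    """
    if numeric:
        return int(user_value)

    wanted = user_value.strip().lower()

    # Assumption: an exact, case-insensitive match wins outright.
    for raw, label in print_conv.items():
        if label.lower() == wanted:
            return raw

    # Otherwise accept a substring match only when it is unique;
    # an ambiguous or unmatched value leaves the tag unset (None).
    hits = [raw for raw, label in print_conv.items()
            if wanted in label.lower()]
    return hits[0] if len(hits) == 1 else None


# Hypothetical conversion table, for illustration only.
WHITE_BALANCE = {1: "Daylight", 2: "Fluorescent", 3: "Tungsten", 4: "Flash"}

print(resolve_value("day", WHITE_BALANCE))              # unique match
print(resolve_value("fl", WHITE_BALANCE))               # ambiguous
print(resolve_value("3", WHITE_BALANCE, numeric=True))  # numeric level
```

With this table, 'day' resolves to a single entry, while 'fl' matches both 'Fluorescent' and 'Flash' and is therefore rejected.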

As well as changing tag values wherever they are found in the image, ExifTool will also create the tag in the preferred group if it doesn't already exist there. By default, the preferred group is the first of the following where the tag is found: 1) EXIF, 2) GPS, 3) IPTC, 4) XMP, 5) MakerNotes. Alternatively, the desired group can be specified so that ExifTool writes the tag to a single location only. Currently, family 0 group names (EXIF, GPS, IPTC, XMP, MakerNotes) and family 1 EXIF directory names (IFD0, IFD1, ExifIFD, GlobParamIFD, InteropIFD, SubIFD) may be specified, but I plan to eventually allow other family 1 group names to be used as well.
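One way to read the group selection rule above is sketched below in Python. The function `groups_to_write` and its arguments are hypothetical, and the sketch assumes that "where the tag is found" means "where the tag is defined for that group"; ExifTool's actual logic may differ in its details.

```python
# Default priority order, as listed in the text above.
PREFERRED_ORDER = ["EXIF", "GPS", "IPTC", "XMP", "MakerNotes"]

def groups_to_write(defined_in, present_in, requested_group=None):
    """Return the groups that would receive a new tag value.

    defined_in      -- groups in which the tag is defined (assumption)
    present_in      -- groups where the file already contains the tag
    requested_group -- optional single group chosen by the user
    """
    if requested_group is not None:
        # An explicit group restricts writing to that one location.
        return [requested_group] if requested_group in defined_in else []

    # The preferred group is the first one in priority order
    # that defines the tag.
    preferred = next((g for g in PREFERRED_ORDER if g in defined_in), None)

    # Update every existing copy, and create the tag in the
    # preferred group if it is not there already.
    return [g for g in PREFERRED_ORDER
            if g in defined_in and (g in present_in or g == preferred)]


# A tag defined in three groups but present only in the maker notes:
print(groups_to_write({"EXIF", "XMP", "MakerNotes"}, {"MakerNotes"}))
# → ['EXIF', 'MakerNotes']
```

In the usage line, the existing MakerNotes copy is updated and a new copy is created in EXIF, the highest-priority group that defines the tag.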

If a tag is added to a group that doesn't exist, the new group is created in the file. Conversely, if the last tag is deleted from a group, the group is removed from the file.
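The group lifecycle described above can be modelled with a nested dictionary. This is a toy model for illustration, assuming metadata is held as `{group: {tag: value}}`; the helper names `set_tag` and `delete_tag` are hypothetical and do not correspond to ExifTool functions.

```python
def set_tag(metadata, group, tag, value):
    """Set a tag, creating the group if it does not exist yet."""
    metadata.setdefault(group, {})[tag] = value

def delete_tag(metadata, group, tag):
    """Delete a tag, removing the group once it becomes empty."""
    tags = metadata.get(group)
    if tags is None:
        return
    tags.pop(tag, None)
    if not tags:
        # The last tag is gone, so the group itself is dropped.
        del metadata[group]


meta = {"EXIF": {"Make": "Canon"}}
set_tag(meta, "IPTC", "Keywords", "holiday")  # IPTC group is created
delete_tag(meta, "IPTC", "Keywords")          # IPTC group is removed again
print(meta)  # → {'EXIF': {'Make': 'Canon'}}
```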

Known Problems

Large preview images in JPEG files may be lost when ExifTool rewrites the file. Many digital cameras use these images internally for quick review of pictures, but most other software does not recognize them because they are accessed via proprietary information in the maker notes. Preview images that fit inside the EXIF segment are preserved.

Some data formats have mandatory tags which are currently not written automatically. This is a potential problem if a tag is added to a group that didn't exist before.