Cleaning up F-Spot's database

If you delete any images managed by the photo management application F-Spot, they will nonetheless show up in F-Spot's thumbnail overview. f_spot_cleanup adds the missing database maintenance function to F-Spot.
Version 0.3 now comes with support for newer versions of F-Spot, newer versions of Python, and UTF-8 encoded non-ASCII characters.

Problem description

F-Spot is a personal photo management application for the GNOME desktop (and at one point indeed was GNOME's standard application). It's reasonably nice to use, presenting your images either by date or by "filmroll" (i.e. a set of images all imported at the same time).

So what's wrong with it? Well, two of the most annoying quirks actually play hand in glove (this certainly is true for F-Spot 0.4.3.1, which came with Ubuntu 8.04, but I believe it to be true even today, i.e. in 2011):

  • When importing photos from the same medium N times (say your camera's SD card, which you didn't format in between), you will actually end up with N copies of the same photos. Worse still, depending on your exact usage pattern it is very easy to end up with N photos which, while all having the same image content, differ slightly in their headers, making bog-standard duplicate finding impossible. I actually wrote a script which compares only the image content for that very reason. (Note that this was supposedly fixed at the end of 2009, that is in 0.5.0.3 or later, but broke again in 0.8.)
  • However, even when all these duplicate files have been weeded out, F-Spot will still insist on listing them in its overview (and even show thumbnails), which is most annoying. A bug report against this behaviour was filed in 2005, and duplicates of this bug report crop up every so often, but so far to no avail. Apparently the maintainer holds that deleting photos outside of F-Spot is an error (it is, after all, "fully-featured", so there cannot be a reason to use auxiliary programs), and if anything the bug is with the program deleting the photos...
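
To illustrate why an ordinary file comparison fails here, the following is a minimal sketch of a content-only comparison for JPEG files. It is not the script mentioned above; it rests on the assumption that the copies differ only in their metadata segments (APPn, which hold EXIF and similar data, and COM), and the function name jpeg_content_digest is made up for this example. It is written for Python 3.

```python
import hashlib
import struct


def jpeg_content_digest(data):
    """Hash a JPEG byte string while skipping APPn and COM metadata segments.

    Assumption: duplicates differ only in those metadata segments.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream")
    digest = hashlib.sha1()
    pos = 2  # skip the start-of-image marker
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a segment marker")
        marker = data[pos + 1]
        if marker == 0xDA:
            # Start of scan: everything from here on is image data.
            digest.update(data[pos:])
            break
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        is_metadata = 0xE0 <= marker <= 0xEF or marker == 0xFE
        if not is_metadata:
            digest.update(data[pos:pos + 2 + length])
        pos += 2 + length
    return digest.hexdigest()
```

Two files whose digests match would then be candidates for being the same picture, even if a byte-wise comparison of the whole files reports a difference.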

Solution

If you google for "f-spot clean database" you will find a few scripts which will go through F-Spot's database and delete all entries for which a file cannot be found.

Well, my solution does exactly the same thing. Where it differs from existing solutions is that it will, by default, only delete entries for files whose parent directory still exists (this prevents quite a bit of trouble with removable media), handles spaces in filenames (which are a bad idea nonetheless), and generally comes with quite a few more options than most of the solutions I have seen so far.
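
The parent-directory rule can be sketched roughly as follows. This is only an illustration of the idea, not the code of f_spot_cleanup itself: the function stale_entries and its require_parent parameter are hypothetical names, and the sketch works on a plain list of file paths rather than on F-Spot's database.

```python
import os


def stale_entries(paths, require_parent=True):
    """Return the paths whose database entries look safe to delete.

    Hypothetical helper: a path counts as stale if the file is gone and,
    by default, its parent directory is still present.
    """
    stale = []
    for path in paths:
        if os.path.exists(path):
            continue  # file is still there, keep its entry
        parent = os.path.dirname(path)
        if require_parent and not os.path.isdir(parent):
            # Parent is missing too: most likely an unmounted
            # removable medium, so leave the entry alone.
            continue
        stale.append(path)
    return stale
```

With an unplugged SD card, every photo on it is "missing", but so is its directory, so none of those entries would be reported as stale.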

Installation

So why not give it a try? Download it, drop the script f_spot_cleanup someplace in your PATH and the manpage someplace in your MANPATH, and off you go. Maybe use the --dry-run (-n) option first, to get an idea of what it will do for you, then finally get rid of all these zombie entries (and their thumbnails)...

Prerequisites

This is essentially a Python script (packed with a slightly modified version of Fredrik Lundh's Squeeze – a small Python script from 1997). It therefore needs Python to run. Version 0.3-pre was tested (although not extensively, hence the pre) against Python versions 2.5.2, 2.6.6, 2.7 and 2.7.1+.

Compatibility

Version 0.3-pre was tested (although not extensively, hence the pre) against F-Spot versions 0.4.3.1, 0.6.2, 0.8.0 and 0.8.2. Particular attention was paid to the handling of (UTF-8 encoded) non-ASCII characters.

Known Limitations

  • Version 0.3-pre is believed to run correctly on all of F-Spot's database versions except the first 6 (which are probably only of historic interest anyway).
  • In order to accommodate bugs in several versions of F-Spot, it will do nothing if two files exist where the name of the second is that of the first with at least one character URL-encoded (e.g. "Image 1.jpg" and "Image%201.jpg").
  • If the file name uses non-ASCII characters which are not UTF-8 encoded, great care should be taken (use '-i'), as it depends on your particular version of Python whether this will work or not.
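
The URL-encoding limitation can be illustrated with a small sketch. It approximates the situation by decoding every name once and looking for names that collapse to the same result; ambiguous_names is a hypothetical helper written for Python 3, not part of f_spot_cleanup (which targets Python 2).

```python
from urllib.parse import unquote


def ambiguous_names(names):
    """Group file names that become identical once URL-decoded."""
    by_decoded = {}
    for name in names:
        # unquote() turns e.g. "%20" back into a space.
        by_decoded.setdefault(unquote(name), set()).add(name)
    return [sorted(group) for group in by_decoded.values() if len(group) > 1]


print(ambiguous_names(["Image 1.jpg", "Image%201.jpg", "other.jpg"]))
# prints [['Image 1.jpg', 'Image%201.jpg']]
```

For such a pair it is impossible to tell from the stored name alone which of the two files a database entry refers to, which is why doing nothing is the safe choice.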

Version history

f_spot_cleanup-0.3-pre
Pre-release, dated 22.08.2011. This is a pre-release of version 0.3, which will be the first one to officially support newer versions of F-Spot (0.6-0.8).
Changes:
  • Now also handles files with more than one version.
  • (Probably) supports all database versions from 7 to 18 (14, the earliest version I know of, was used in 0.4.3; 18 is used in 0.8.2).
  • Support for (partly URL-encoded) filenames containing (at least) UTF-8 encoded characters. Whether other character sets would work is untested.
  • Complete rewrite of the DB-access.
  • The name was changed (underscores instead of hyphens) to allow the use of squeeze.
As this version is distributed as a packed script, you might want to download the source separately.
f-spot-cleanup-0.2
Bugfix release, dated 6.2.2011. Works with Python 2.5.2 and F-Spot 0.4.3.
f-spot-cleanup-0.1
Initial release, dated 2.2.2011.