Meta Data
Metadata at CEPIC 2007
After years of fringe meetings attended by a handful of people, metadata has now emerged into the mainstream. Two major conferences on the subject were hosted by CEPIC at its congress in Florence this year.
The first ever international Photo Metadata Conference, ‘Working towards a Seamless Photoflow’, was organised by the IPTC to take place at the CEPIC conference. Speakers included representatives from Reuters and Stern, Adobe and Apple, Hasselblad and Microsoft, Fotoware and Keystone, and BAPLA and PLUS. The aim was to discuss a broad range of topics on metadata in the photo workflow.
‘Lost in Translation’ was organised by the Bridgeman Art Library as part of MILE, a European project on the use of metadata in the cultural heritage sector. Speakers at the MILE conference included representatives from The Bridgeman Art Library, The Museums Documentation Association (MDA) and Alinari. The seminar looked in particular at image retrieval and keyword translation systems.
Both events looked at the difficulty of retrieving images on the internet, and on one issue everyone was in agreement: there are too many images out there. As the dollar image shoots to prominence and bright young companies try their luck in the microstock market, the need for more focussed image retrieval is ever more pressing. And that means paying attention to the quality of metadata and to its location.
The image buyer wants relevant images, and the image seller wants their own images seen. It’s a delicate balance. Searches are bringing up too many irrelevant results, partly because of the scattergun approach to keywording and partly because of the ambiguous nature of language itself; and that’s before you even look at the ‘lost in translation’ issues of multilingual search.
At the IPTC conference, Andreas Trampe from Stern told us how his picture editors wade through 12,000 images a day, with no respite, even at the weekend; the images just keep on coming. Lack of precision in keywording, and the fact that it is often easier to transmit an image than to edit it, mean that the job of picture editor now borders on the surreal.
The needs of the customer will prevail, one way or the other, but defining them is another matter. At the MILE conference, Xavier Castelle from picture agency AISA in Barcelona compared the benefits of controlled vocabulary and free-text searches. While controlled vocabularies provide a more precise means of finding pictures, users prefer the Google-style search. But as Andreas Trampe asked, how many of us have ever seen page 81 of a Google result?
Multilingual keywording is a high priority for both commercial agencies and the cultural sector. Controlled vocabularies are the only meaningful way of achieving automated translation. As Castelle pointed out, the free-text search is much easier to adapt to other systems; a list of keywords is easy to send to an agent. If you use controlled vocabularies, you have to agree on a vocabulary. With image libraries competing to create the best and most usable keywording systems, agreement on vocabularies might be a tall order.
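The link between controlled vocabularies and automated translation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the concept IDs, the labels and the `translate_keywords` function are invented here and do not describe any agency's actual system. Each concept has one ID and one agreed label per language, so translation becomes a lookup, while free-text keywords outside the vocabulary fall through unmatched.

```python
# Hypothetical controlled vocabulary: one concept ID, one preferred label
# per language. IDs and labels are invented for illustration only.
VOCABULARY = {
    "C001": {"en": "tiger", "es": "tigre", "de": "Tiger"},
    "C002": {"en": "butterfly", "es": "mariposa", "de": "Schmetterling"},
}


def build_index(vocabulary, lang):
    """Map each lower-cased label in `lang` to its concept ID."""
    return {
        labels[lang].lower(): concept_id
        for concept_id, labels in vocabulary.items()
        if lang in labels
    }


def translate_keywords(keywords, source, target, vocabulary=VOCABULARY):
    """Translate controlled terms; return (translated, unmatched) lists."""
    index = build_index(vocabulary, source)
    translated, unmatched = [], []
    for keyword in keywords:
        concept_id = index.get(keyword.strip().lower())
        if concept_id is not None and target in vocabulary[concept_id]:
            translated.append(vocabulary[concept_id][target])
        else:
            # Free-text keywords have no agreed concept, so they cannot
            # be translated by lookup.
            unmatched.append(keyword)
    return translated, unmatched


translated, unmatched = translate_keywords(["Tiger", "sunset"], "en", "es")
print(translated, unmatched)
```

The sketch also shows the cost Castelle described: both parties must share the same `VOCABULARY` before the lookup works at all.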
Words have their limits, but precision depends on getting rid of ambiguities. This is the basis of the so-called ‘semantic’ image search, which uses networks of words to create precise meanings and increase relevancy, and brings in visual search mechanisms to assist the task of ‘disambiguating’ the word data. Take the keyword ‘tiger’. It can find cats, butterflies, Tiger Woods, army tanks and operating systems, just for starters.
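A toy version of word-based disambiguation, using the ‘tiger’ example, might look like the Python sketch below. It is a simplification, not a description of any real semantic search engine: the sense names, the cue words in `SENSES` and the `disambiguate` function are all assumptions made for illustration. The idea is simply to score each possible meaning by how many of its cue words appear among the image's other keywords.

```python
# Hypothetical senses of the keyword "tiger", each with a few cue words.
# Real systems would draw on much larger networks of related terms.
SENSES = {
    "tiger (big cat)": {"cat", "stripes", "jungle", "wildlife"},
    "tiger (butterfly)": {"butterfly", "wings", "insect"},
    "Tiger Woods": {"golf", "golfer", "tournament"},
    "Tiger tank": {"tank", "army", "military"},
    "Tiger (operating system)": {"software", "computer", "apple"},
}


def disambiguate(keywords, senses=SENSES):
    """Return the sense(s) sharing the most cue words with the keywords.

    An empty list means the other keywords give no help at all.
    """
    context = {keyword.lower() for keyword in keywords}
    scores = {sense: len(context & cues) for sense, cues in senses.items()}
    best = max(scores.values())
    if best == 0:
        return []
    return [sense for sense, score in scores.items() if score == best]


print(disambiguate(["tiger", "golf", "tournament"]))
print(disambiguate(["tiger"]))
```

A lone keyword ‘tiger’ returns no sense at all, which is the ambiguity the paragraph above describes; it is the surrounding words that carry the meaning.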
Blasting keywords at pictures and using a thesaurus as a catch-all is no longer sufficient. Agencies are developing new filters and ranking systems to narrow the search. A visual recognition search like the one currently being tested by Corbis (http://corbis.ltutech.com) could remove ambiguity by using machine-remembered images of cat tigers or tank tigers to find similar ones.
Designer-friendly categories like composition and viewpoint are coming into vogue, as on iStockphoto, where designers can also use a tool to define space for copy. Ranking systems like AlamyRank and Masterfile’s ‘People’s Choice’ use previous customer behaviour to bring up the ‘best’ images.
Use of IPTC fields is only now becoming common as picture libraries grasp the benefits of carrying information in the image file, and the dangers of letting images become orphans. Use of the BAPLA/Pic4press panel, demonstrated at the conference, is one way to encourage libraries and photographers to enter key information consistently. The IPTC is going further to address some of the gaps in the current schema, which was originally conceived for use in the news industry. The organisation’s White Paper, presented at the conference, sets out ways in which IPTC Core can be extended and improved for use in the broader picture industry, including the stock image sector.
Identifying the source of an image is high on the agenda for everyone. Many photographers would like a unique ID that traces back to the copyright holder. Can an ID be hardwired into the image metadata, they ask? It can’t. There is always a means of removing metadata from an image (even EXIF data can be stripped), and the XML fields were conceived to be freely used, not to be locked in. However, the IPTC is looking at the option of tracking changes in metadata, forming layers or versions, which would indicate that changes have taken place.
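What ‘layers or versions’ of metadata could mean in practice can be pictured with a small Python sketch. This is not the IPTC's design, which had not been specified at the time of the conference; the `MetadataHistory` class, its methods and the field names are hypothetical. The sketch keeps every version of a record and reports which fields differ from the original, so a change is visible even though nothing prevents it.

```python
import hashlib
import json


def fingerprint(fields):
    """Return a stable SHA-256 hash of a metadata dictionary."""
    payload = json.dumps(fields, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class MetadataHistory:
    """Keep every version of a metadata record (hypothetical design)."""

    def __init__(self, fields):
        self.versions = [dict(fields)]

    @property
    def current(self):
        return self.versions[-1]

    def update(self, **changes):
        """Add a new version only if the record actually changes."""
        new_fields = {**self.current, **changes}
        if fingerprint(new_fields) != fingerprint(self.current):
            self.versions.append(new_fields)

    def changed_fields(self):
        """List the fields that differ between first and latest version."""
        first, last = self.versions[0], self.current
        return sorted(
            key for key in set(first) | set(last)
            if first.get(key) != last.get(key)
        )


history = MetadataHistory({"creator": "A. Photographer", "caption": "Tiger"})
history.update(caption="Tiger at dusk")
print(len(history.versions), history.changed_fields())
```

Such a history only helps while it travels with the image; if the metadata is stripped altogether, the versions go with it.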
Automation is one of the drivers for metadata adoption; clearly there is scope for some information to be added by the photographer at the point of image capture. Some information could be automatically recorded, such as sat nav location information, date, or unique ID. Should caption and other job information be added at that point? There are differing viewpoints on this, but the IPTC Committee will be looking into the possibilities, both technical and practical. Camera manufacturers Nikon and Canon were at the conference, the first time they have appeared at an event of this kind. They asked what we would like to see recorded in the camera. Answering this question is another challenge for the industry as it looks at how to make image search fit the demands of the expanding web.
© Sarah Saunders, Electric Lane
Published in Visuell August 2007
