Reverse Engineering the Slide Library

The Media Center is delighted to announce that it has received a one-year Sparks! Ignition grant from the Institute of Museum and Library Services. This award will fund a project to assess the feasibility of using deep learning and computer vision to automatically sort and catalog digitized 35mm slides, with the goal of creating an open-source, scalable framework for archival discovery in legacy slide collections worldwide.

The Columbia University Department of Art History and Archaeology has a library of over 400,000 35mm slides collected, curated, and created by faculty and students during the latter half of the 20th century. The collection covers a vast geographic and temporal range of topics in the fields of art history, archaeology, and anthropology. Together, the slide images and labels form an important art historical resource that remains unused due to the obsolescence of the medium. The labels provide a teaching bibliography and a record of the subject areas and artworks taught in the department over the last 60 years. Additionally, many of the slides are unique fieldwork photographs taken by Columbia faculty and students for original research. As with most 35mm slide libraries, a master catalog for the collection was never created.

The size of the collection, coupled with the lack of a master catalog, makes it prohibitively difficult to access the significant image resources it contains, a problem faced by many slide collections. This project will apply several computer science techniques to a sample set of digitized slides from the collection in an attempt to solve this problem. One experiment will use Optical Character Recognition (OCR) software to automatically read and catalog the slide labels. Another will adapt computer vision and machine learning processes to automatically detect whether a slide image was copied from a book or is an original photograph. The results of these experiments will be compared with manual transcription and cataloging of the same slides to assess the efficacy of the processes. At the end of the project, the Media Center will produce a white paper, along with a website, documenting the results of these experiments and exploring the feasibility of replicating these techniques for other collections.
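
As a rough illustration of how OCR output might be compared with a manual transcription, the sketch below computes a character error rate: the edit distance between the two texts divided by the length of the manual transcription. This is only one common way to score OCR and is not necessarily the measure the project will use. The function names and the sample label text are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(ocr_text: str, manual_text: str) -> float:
    """Edit distance normalized by the length of the manual transcription.
    Runs of whitespace are collapsed first so line breaks on a label
    do not count as errors (an assumption made for this sketch)."""
    reference = " ".join(manual_text.split())
    hypothesis = " ".join(ocr_text.split())
    if not reference:
        return 0.0 if not hypothesis else 1.0
    return edit_distance(hypothesis, reference) / len(reference)


if __name__ == "__main__":
    # Hypothetical label text: OCR misreads "m" as "rn".
    manual = "Rome, Pantheon"
    ocr = "Rorne, Pantheon"
    print(f"CER: {character_error_rate(ocr, manual):.3f}")
```

A lower value means the OCR reading is closer to the manual transcription. Averaging the rate over the sample set would give a single figure for comparing OCR settings.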

This project was made possible in part by the Institute of Museum and Library Services Grant LG-89-17-0218-17.

This website will be periodically updated with news and more information about the project.