The British Library, one of the world's great repositories of historical information, has scanned a million images from books published in the 17th to 19th centuries. The public domain images have all been uploaded to a Flickr account for the public to use, remix and repurpose.
According to the Library:
The images were plucked from the pages as part of the ‘Mechanical Curator’, a creation of the British Library Labs project. Each image is individually addressable, online, and Flickr provides an API to access it and the image’s associated description.
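For readers who want to try that API, here is a minimal Python sketch of looking up one image's title and description through Flickr's public REST method `flickr.photos.getInfo`. The helper names are illustrative, not part of any British Library tooling, and the API key and photo ID are placeholders you would need to supply yourself.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Flickr's REST endpoint; the method name goes in the query string.
FLICKR_REST_ENDPOINT = "https://api.flickr.com/services/rest/"


def build_photo_info_url(api_key, photo_id):
    """Build the request URL for flickr.photos.getInfo (hypothetical helper)."""
    params = {
        "method": "flickr.photos.getInfo",
        "api_key": api_key,      # placeholder: your own Flickr API key
        "photo_id": photo_id,    # placeholder: the ID of one image
        "format": "json",
        "nojsoncallback": 1,     # ask for plain JSON, not a JSONP wrapper
    }
    return FLICKR_REST_ENDPOINT + "?" + urlencode(params)


def extract_title_and_description(payload):
    """Pull the title and description text out of a getInfo response."""
    if payload.get("stat") != "ok":
        raise ValueError(payload.get("message", "Flickr API request failed"))
    photo = payload["photo"]
    return photo["title"]["_content"], photo["description"]["_content"]


def fetch_photo_info(api_key, photo_id):
    """Request one photo's metadata over the network and return (title, description)."""
    with urlopen(build_photo_info_url(api_key, photo_id)) as response:
        payload = json.load(response)
    return extract_title_and_description(payload)
```

Calling `fetch_photo_info("YOUR_API_KEY", "SOME_PHOTO_ID")` with real values should return the title and description as a pair of strings. Error handling and rate limiting are left out to keep the sketch short.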
The plan is to keep expanding on this work, in the hope that public use and crowdsourced knowledge will help train an automated classifier, presumably one based on artificial intelligence.
We plan to launch a crowdsourcing application at the beginning of next year, to help describe what the images portray. Our intention is to use this data to train automated classifiers that will run against the whole of the content. The data from this will be as openly licensed as is sensible (given the nature of crowdsourcing) and the code, as always, will be under an open licence.
The manifests of images, with descriptions of the works that they were taken from, are available on GitHub and are also released under a public-domain ‘licence’. This set of metadata being on GitHub should indicate that we fully intend people to work with it, to adapt it, and to push back improvements that should help others work with this release.
There are very few datasets of this nature free for any use and by putting it online we hope to stimulate and support research concerning printed illustrations, maps and other material not currently studied. The images are derived from just 65,000 volumes, and the library holds many millions of items.
All of this should make it easier for researchers and the public to quickly find the information they are looking for in an enormous collection of material. Meanwhile, if you are into history, there is a fountain of it here to explore and play with.
via Boing Boing