The datasets used in tranScriptorium, together with their corresponding ground truth, can be downloaded from this web page for research purposes. For more information, follow the link for each collection.
The Bentham collection consists of images of works on law and moral philosophy written by the philosopher Jeremy Bentham.
The Hattem collection is a 15th-century Dutch document from the “Artes Liberales” literature, belonging to the medical domain.
The Plantas collection is a series of seven books written by Bernardo de Cienfuegos in the 17th century. It is written in Spanish and also contains many drawings.
The Reichsgericht collection is a selection of court decisions of the German High Court (Reichsgericht) from 1900 to 1914.
The Wiensanktulrich collection is a selection of documents containing birth records from the 16th century, written in Vienna (Wien).