HTR crowdsourcing platform launched

The Handwritten Text Recognition (HTR) technology being developed for tranScriptorium offers a number of exciting possibilities for transcription, not least for crowdsourcing. Our experience of running Transcribe Bentham, which crowdsources the transcription of the manuscripts of the English philosopher and reformer Jeremy Bentham (1748–1832), suggests that a major barrier to participation is the difficulty of deciphering historic handwriting.

We are now testing TSX, a crowdsourced transcription platform developed by the University of London Computer Centre, which incorporates HTR technology to assist volunteers with their transcription. TSX allows for participation in three interconnected ways, depending upon the user’s level of experience, preferences, and available free time.

First, users can transcribe a manuscript without any assistance from the HTR technology, while still taking advantage of useful features of TSX, including the segmentation of manuscript images into lines and colour-coded TEI mark-up.

Transcription and encoding, without using HTR technology

Second, if the user is new to transcription or doesn’t have much spare time, they can request a full transcript of a given manuscript from the HTR engine. As it is unlikely that any HTR transcript will ever be entirely accurate, the user can then correct it against the manuscript image.

HTR transcript correction

Third, and perhaps most excitingly, the user can request suggestions from the HTR engine for words which they might not otherwise be able to decipher. This should reduce, to an extent, the frustration of trying to read historic handwriting.

TSX word suggestions

TSX runs on the infrastructure provided by Transkribus, a tool for transcription and transcription management. Future development work for TSX will include the introduction of a What-You-See-Is-What-You-Get transcription interface, removing the need for the transcriber to deal with visible TEI mark-up and allowing them to concentrate fully on transcription.

If you would like to try transcribing manuscripts using TSX, please visit the website and register an account to get started. All feedback will be very gratefully received!