Handwritten Text Recognition: Word-Graphs, Keyword Spotting and Computer Assisted Transcription
Tutorial Slides (Draft)
- I- Introduction
- icfhrHTRTut2014-I-2p.pdf (0.8M)
- II- Handwritten Text Recognition (HTR)
- icfhrHTRTut2014-II-2p.pdf (4.8M)
- III- Basic HTR experiments
- icfhrHTRTut2014-III-2p.pdf (0.2M)
- IV- Word-Graphs (WG)
- icfhrHTRTut2014-IV-2p.pdf (0.8M)
- V- WG Applications: Computer-Assisted Transcription of Text Images (CATTI)
- icfhrHTRTut2014-V-2p.pdf (0.4M)
- VI- WG Applications: Keyword Spotting (KWS) in Large Manuscript Collections
- icfhrHTRTut2014-VI-2p.pdf (9.3M)
- Complete CATTI and KWS systems in real HTR tasks:
- CATTI demonstrations (old GUI instructions, new GUI instructions)
- KWS demonstrations
Practice Guide
The aim of this practice guide is to familiarize students with the use of HTK (Hidden Markov Model Toolkit) for handwritten text recognition (HTR). It also shows how to obtain word-graphs from the HTR decoding process and how they can be used for parameter optimization or for decoding with n-gram models with n>2 (re-scoring). In addition, it briefly explains some homemade tools for image preprocessing and feature extraction implemented for HTR.
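To make the re-scoring idea more concrete, here is a minimal Python sketch of re-scoring a word-graph with a trigram model. It is only an illustration and does not use HTK or SRILM. The toy graph, the scores, the trigram table and the names `EDGES`, `TRIGRAM` and `rescore` are all hypothetical. Node ids are assumed to be topologically ordered, and unseen trigrams are assumed to get a fixed floor score rather than proper back-off.

```python
import math
from collections import defaultdict

# Hypothetical toy word-graph: (from_node, to_node, word, HTR log-score).
EDGES = [
    (0, 1, "the", -1.0),
    (0, 1, "he", -1.2),
    (1, 2, "cat", -2.0),
    (1, 2, "cap", -1.8),
    (2, 3, "sat", -1.5),
]

# Hypothetical trigram log-probabilities; anything missing gets a floor score.
TRIGRAM = {
    ("<s>", "<s>", "the"): math.log(0.5),
    ("<s>", "the", "cat"): math.log(0.4),
    ("the", "cat", "sat"): math.log(0.5),
}


def rescore(edges, start, end, lm, lm_weight=1.0, floor=math.log(1e-4)):
    """Return (words, score) of the best path under HTR score + weighted trigram score.

    Assumes every edge goes from a lower node id to a higher one and that
    `end` is reachable from `start`.
    """
    outgoing = defaultdict(list)
    for src, dst, word, score in edges:
        outgoing[src].append((dst, word, score))

    # best[node][two-word history] = (accumulated score, words so far)
    best = defaultdict(dict)
    best[start][("<s>", "<s>")] = (0.0, [])

    for node in sorted(outgoing):  # topological order by assumption
        for hist, (score, words) in list(best[node].items()):
            for dst, word, edge_score in outgoing[node]:
                lm_score = lm.get(hist + (word,), floor)
                new_score = score + edge_score + lm_weight * lm_score
                new_hist = (hist[1], word)
                old = best[dst].get(new_hist)
                if old is None or new_score > old[0]:
                    best[dst][new_hist] = (new_score, words + [word])

    score, words = max(best[end].values(), key=lambda item: item[0])
    return words, score


if __name__ == "__main__":
    print(rescore(EDGES, 0, 3, TRIGRAM, lm_weight=0.0)[0])  # HTR scores only
    print(rescore(EDGES, 0, 3, TRIGRAM, lm_weight=1.0)[0])  # with trigram re-scoring
```

In this toy graph the HTR scores alone prefer "cap", while the trigram scores move the best path to "cat". Keeping one hypothesis per (node, two-word history) pair is what allows a model with n>2 to be applied to a graph that was produced with a shorter context.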
By far the most important software in this practice is "The Hidden Markov Model Toolkit (HTK), version 3.4.1", which (including its documentation) can be downloaded from http://htk.eng.cam.ac.uk.
In addition, training n-gram language models requires the SRI Language Modeling Toolkit (SRILM).
Furthermore, as this practice is carried out entirely on Linux, prior knowledge of and experience with this operating system and with standard GNU/Linux tools such as bash, awk, netpbm, xv, etc. are assumed.
It is strongly recommended that each attendee of this tutorial bring their own Linux laptop with the above tools (HTK and SRILM in particular) already installed.
Guide: Exp-Guide.pdf (177K)
Bentham Dataset, BenthamData.tar.bz2 (185M)
HTR processing tools, HTR-toolsUtils.tar.bz2 (55.3K)
Dr. Moises Pastor, Dr. Joan Andreu Sánchez, Dr. Alejandro H. Toselli and Dr. Enrique Vidal.