Digital libraries worldwide are publishing huge numbers of handwritten historical documents. However, for these raw digital images to be really useful, they need to be annotated with informative content. State-of-the-art Handwritten Text Recognition (HTR) approaches require a substantial training effort by expert paleographers. Our contribution is In Codice Ratio, a scalable, end-to-end transcription workflow that requires limited training effort and is based on fine-grained segmentation of text elements into characters and symbols. We provide a preliminary evaluation of In Codice Ratio over a corpus of letters by Pope Honorius III, stored in the Vatican Secret Archive.
Ammirati, S., Firmani, D., Maiorino, M., Merialdo, P., Nieddu, E., Rossi, A. (2017). In Codice Ratio: Scalable Transcription of Historical Handwritten Documents. In 25th Italian Symposium on Advanced Database Systems (SEBD). CEUR-WS.
In Codice Ratio: Scalable Transcription of Historical Handwritten Documents
Ammirati Serena; Firmani Donatella; Maiorino Marco; Merialdo Paolo; Nieddu Elena; Rossi Andrea
2017-01-01