Faculty of Informatics, Vienna University of Technology, Institute of Computer Aided Automation, PRIP

Markus Diem: Recognizing Degraded Handwritten Characters

Final presentation of the diploma thesis

What: Presentation
When: Mar 04, 2010, from 02:15 pm to 02:35 pm
Where: Sem 183/2
This thesis proposes a new character recognition system that can handle the degraded manuscripts discovered at St. Catherine's Monastery. In contrast to state-of-the-art OCR systems, it requires no early decision, namely image binarization. Instead, an object recognition methodology is adapted to the recognition of ancient manuscripts: interest points are extracted, and local descriptors are computed at these points. The descriptors are classified directly using an SVM with one-against-all tests.
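The one-against-all decision mentioned above can be illustrated with a minimal sketch. It assumes linear binary classifiers, which the abstract does not specify; the function name `one_against_all`, the two-dimensional descriptor, and the weights are hypothetical and chosen only for illustration.

```python
def one_against_all(descriptor, classifiers):
    """Return (best_label, scores) for a single local descriptor.

    `classifiers` maps each class label to a hypothetical (weights, bias)
    pair describing a linear binary classifier "this class vs. all others".
    """
    # Evaluate every binary classifier on the same descriptor.
    scores = {
        label: sum(w * x for w, x in zip(weights, descriptor)) + bias
        for label, (weights, bias) in classifiers.items()
    }
    # The class whose classifier responds most strongly wins.
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical classifiers for two character classes.
classifiers = {
    "a": ([1.0, 0.0], -0.5),
    "b": ([0.0, 1.0], -0.5),
}
label, scores = one_against_all([0.9, 0.1], classifiers)
print(label, scores)
```

In a system like the one described, the per-class scores would typically be converted to class probabilities so that they can be accumulated later.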

In order to localize characters, interest points that represent whole characters are found by means of a scale distribution histogram. The remaining interest points are then clustered using k-means, which is initialized with the previously selected interest points. Finally, a voting scheme is applied in which the class probabilities assigned to the local descriptors during classification are accumulated into a single class probability histogram for each character cluster. This histogram not only allows for a hard decision but can also be presented to human experts, who can decide the character class of barely legible characters according to the probabilities obtained.
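The clustering and voting steps can be sketched as follows. This is a simplified illustration, not the thesis implementation: it assumes 2-D interest point coordinates, per-descriptor class probabilities given as dictionaries, and a normalized histogram per cluster. The function names `seeded_kmeans` and `vote` and all sample values are hypothetical.

```python
import math


def seeded_kmeans(points, seeds, iterations=10):
    """Cluster 2-D points with k-means, starting from the given seed centers."""
    centers = [tuple(s) for s in seeds]
    assignment = []
    for _ in range(iterations):
        # Assign each point to its nearest center.
        assignment = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        # Move each center to the mean of its members (keep it if it has none).
        new_centers = []
        for k, center in enumerate(centers):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:
                center = tuple(sum(c) / len(members) for c in zip(*members))
            new_centers.append(center)
        if new_centers == centers:
            break
        centers = new_centers
    return assignment, centers


def vote(assignment, probabilities, n_clusters):
    """Accumulate per-descriptor class probabilities into one histogram per cluster."""
    histograms = [{} for _ in range(n_clusters)]
    for cluster, probs in zip(assignment, probabilities):
        for label, p in probs.items():
            histograms[cluster][label] = histograms[cluster].get(label, 0.0) + p
    # Normalize so each histogram sums to 1 (an assumption of this sketch).
    for hist in histograms:
        total = sum(hist.values())
        if total > 0:
            for label in hist:
                hist[label] /= total
    return histograms


# Hypothetical data: six remaining interest points and two character seeds.
points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (5.0, 5.0), (5.1, 4.8), (4.9, 5.2)]
seeds = [(1.0, 1.0), (5.0, 5.0)]
probabilities = [
    {"a": 0.7, "b": 0.3}, {"a": 0.6, "b": 0.4}, {"a": 0.4, "b": 0.6},
    {"a": 0.1, "b": 0.9}, {"a": 0.3, "b": 0.7}, {"a": 0.2, "b": 0.8},
]
assignment, centers = seeded_kmeans(points, seeds)
for hist in vote(assignment, probabilities, len(seeds)):
    print(hist)
```

Returning the full histogram rather than only its maximum mirrors the idea that an expert can inspect the probabilities when a character is hard to read.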

The system was evaluated on three different datasets: a synthetic dataset with Latin script, degraded characters, and real-world data. It achieves an F0.5-score of 0.77 on the real-world data.
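For reference, the F-beta score combines precision and recall, and beta = 0.5 weights precision more heavily than recall. The sketch below uses the standard definition; the precision and recall values in the example are hypothetical and are not taken from the thesis.

```python
def f_beta(precision, recall, beta=0.5):
    """Standard F-beta score: (1 + beta^2) * P * R / (beta^2 * P + R)."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical precision and recall, for illustration only.
print(f_beta(0.8, 0.5))
```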