Recognizing lines of unconstrained handwritten text is a challenging task. The difficulty of segmenting cursive or overlapping characters, combined with the need to exploit surrounding context, has ...led to low recognition rates for even the best current recognizers. Most recent progress in the field has been made either through improved preprocessing or through advances in language modeling. Relatively little work has been done on the basic recognition algorithms. Indeed, most systems rely on the same hidden Markov models that have been used for decades in speech and handwriting recognition, despite their well-known shortcomings. This paper proposes an alternative approach based on a novel type of recurrent neural network, specifically designed for sequence labeling tasks where the data is hard to segment and contains long-range bidirectional interdependencies. In experiments on two large unconstrained handwriting databases, our approach achieves word recognition accuracies of 79.7 percent on online data and 74.1 percent on offline data, significantly outperforming a state-of-the-art HMM-based system. In addition, we demonstrate the network's robustness to lexicon size, measure the individual influence of its hidden layers, and analyze its use of context. Last, we provide an in-depth discussion of the differences between the network and HMMs, suggesting reasons for the network's superior performance.
Keyword spotting refers to the process of retrieving all instances of a given keyword from a document. In the present paper, a novel keyword spotting method for handwritten documents is described. It ...is derived from a neural network-based system for unconstrained handwriting recognition. As such it performs template-free spotting, i.e., it is not necessary for a keyword to appear in the training set. The keyword spotting is done using a modification of the CTC Token Passing algorithm in conjunction with a recurrent neural network. We demonstrate that the proposed systems outperform not only a classical dynamic time warping-based approach but also a modern keyword spotting system, based on hidden Markov models. Furthermore, we analyze the performance of the underlying neural networks when using them in a recognition task followed by keyword spotting on the produced transcription. We point out the advantages of keyword spotting when compared to classic text line recognition.
Investigates the influence of the cost function on the optimal match between two graphs. It is shown that, for a given cost function, there are an infinite number of other cost functions that lead, ...for any given pair of graphs, to the same optimal error correcting matching. Furthermore, it is shown that well-known concepts from graph theory, such as graph isomorphism, subgraph isomorphism, and maximum common subgraph, are special cases of optimal error correcting graph matching under particular cost functions.
In approximate, or error-correcting, graph matching one considers a set of graph edit operations, and defines the edit distance of two graphs g1 and g2 as the shortest (or least cost) sequence of ...edit operations that transform g1 into g2. A maximum common subgraph of two graphs g1 and g2 is a subgraph of both g1 and g2 such that there is no other subgraph of g1 and g2 with more nodes. Graph edit distance and maximum common subgraph are well known concepts that have various applications in pattern recognition and machine vision. In this paper a particular cost function for graph edit distance is introduced, and it is shown that under this cost function graph edit distance computation is equivalent to the maximum common subgraph problem.
Full text
Available for:
IJS, IMTLJ, KILJ, KISLJ, NUK, SBCE, SBJE, UL, UM, UPCLJ, UPUK
Unconstrained off-line continuous handwritten text recognition is a very challenging task which has been recently addressed by different promising techniques. This work presents our latest ...contribution to this task, integrating neural network language models in the decoding process of three state-of-the-art systems: one based on bidirectional recurrent neural networks, another based on hybrid hidden Markov models and, finally, a combination of both. Experimental results obtained on the IAM off-line database demonstrate that consistent word error rate reductions can be achieved with neural network language models when compared with statistical N-gram language models on the three tested systems. The best word error rate, 16.1%, reported with ROVER combination of systems using neural network language models significantly outperforms current benchmark results for the IAM database.
•We study the use of neural network language models for two state-of-the-art recognizers for unconstrained off-line HTR.•We found consistent improvement when using this language model, combined or not with standard N-grams language models.•The neural network language model scales well with different dictionary sizes for the IAM-DB task.•By combining the two recognition systems, unprecedented accuracy for the IAM database is reported.
Full text
Available for:
GEOZS, IJS, IMTLJ, KILJ, KISLJ, NUK, OILJ, PNG, SAZU, SBCE, SBJE, UL, UM, UPCLJ, UPUK
The median graph has been presented as a useful tool to represent a set of graphs. Nevertheless its computation is very complex and the existing algorithms are restricted to use limited amount of ...data. In this paper we propose a new approach for the computation of the median graph based on graph embedding. Graphs are embedded into a vector space and the median is computed in the vector domain. We have designed a procedure based on the weighted mean of a pair of graphs to go from the vector domain back to the graph domain in order to obtain a final approximation of the median graph. Experiments on three different databases containing large graphs show that we succeed to compute good approximations of the median graph. We have also applied the median graph to perform some basic classification tasks achieving reasonable good results. These experiments on real data open the door to the application of the median graph to a number of more complex machine learning algorithms where a representative of a set of graphs is needed.
Full text
Available for:
GEOZS, IJS, IMTLJ, KILJ, KISLJ, NUK, OILJ, PNG, SAZU, SBCE, SBJE, UL, UM, UPCLJ, UPUK
This paper presents a system for the offline recognition of large vocabulary unconstrained handwritten texts. The only assumption made about the data is that it is written in English. This allows the ...application of statistical language models in order to improve the performance of our system. Several experiments have been performed using both single and multiple writer data. Lexica of variable size (from 10,000 to 50,000 words) have been used. The use of language models is shown to improve the accuracy of the system (when the lexicon contains 50,000 words, the error rate is reduced by /spl sim/50 percent for single writer data and by /spl sim/25 percent for multiple writer data). Our approach is described in detail and compared with other methods presented in the literature to deal with the same problem. An experimental setup to correctly deal with unconstrained text recognition is proposed.
Researchers from various disciplines such as pattern recognition, statistics, and machine learning have explored the use of ensemble methodology since the late seventies. Thus, they are faced with a ...wide variety of methods, given the growing interest in the field. This book aims to impose a degree of order upon this diversity by presenting a coherent and unified repository of ensemble methods, theories, trends, challenges and applications.
In object prototype learning and similar tasks, median computation is an important technique for capturing the essential information of a given set of patterns. We extend the median concept to the ...domain of graphs. In terms of graph distance, we introduce the novel concepts of set median and generalized median of a set of graphs. We study properties of both types of median graphs. For the more complex task of computing generalized median graphs, a genetic search algorithm is developed. Experiments conducted on randomly generated graphs demonstrate the advantage of generalized median graphs compared to set median graphs and the ability of our genetic algorithm to find approximate generalized median graphs in reasonable time. Application examples with both synthetic and nonsynthetic data are shown to illustrate the practical usefulness of the concept of median graphs.