Handwriting: keyword spotting The Query by Example (QbE) case

Zagoris, Konstantinos; Pratikakis, Ioannis; Barlas, Georgios

dc.contributor.author	Barlas, Georgios
dc.contributor.author	Zagoris, Konstantinos
dc.contributor.author	Pratikakis, Ioannis
dc.date.accessioned	2021-02-12T08:49:44Z
dc.date.available	2021-02-12T08:49:44Z
dc.date.issued	2017-07-21
dc.identifier.isbn	978-1-53611-937-4
dc.identifier.uri	http://hdl.handle.net/11728/11652
dc.description.abstract	The traditional approach to document indexing usually involves an Optical Character Recognition (OCR) step. Although OCR performs well on modern printed documents and on documents with high-quality printing, in the case of handwritten documents several factors degrade its performance, such as intense degradation, paper-positioning variations (skew, translations, etc.) and the variety of writing styles. Handwritten word spotting has attracted the attention of the research community in the field of document image analysis and recognition, since it appears to be a feasible solution for indexing and retrieving handwritten documents when OCR-based methods fail to deliver satisfactory results. Handwritten keyword spotting (KWS) is the task of retrieving all instances of a given query word in handwritten document image collections without involving a traditional OCR step. There are two basic variations of KWS approaches: (a) the Query by Example (QbE) case, where the query is a word image, and (b) the Query by String (QbS) case, where, as the name implies, the query is a string. The study presented in this chapter focuses on the QbE approach. For a better understanding, QbE methods are presented from two different perspectives, which relate to the use of segmentation and of learning. The segmentation-based methods are divided into two subcategories according to the segmented entity, which can be either the word image or the text line. They depend strongly on the segmentation step; therefore, in order to compare different methods regardless of segmentation errors, many researchers do not implement a segmentation method but instead use datasets in which the segments are given. In segmentation-free methods, the query image is compared for similarity against patches of the whole document image, without segmenting it at any level. 
Methods of this class bypass the segmentation step, but they cannot avoid searching for words in parts of the image that may not contain text. Therefore, segmentation-free methods avoid failures due to bad segmentation, but their running time increases considerably. It is worth mentioning that methods of this class are not the prevailing trend. Training-based methods are those that require training data at a particular stage of the process. A common problem with these methods is the availability of training data. Furthermore, applying such a method to a new word usually requires ground-truthing work to obtain training data, which is quite time-consuming and often has to be done entirely manually. Training-free methods, as the name implies, do not include any training stage in the operational KWS pipeline. They can be applied directly to a new word, although they usually require a particular configuration to be effective on the corresponding text. This chapter is structured as follows: Section “Segmentation-based Context” presents the KWS methodologies that operate in a segmentation-based context, detailing both methods based on training and methods that are independent of any training. Both variations are reviewed separately according to the type of segmentation used. Section “Segmentation-Free Context” discusses methodologies that do not rely on segmentation, with a particular focus on whether or not training is used. Section “Experimental Datasets and Evaluation Metrics” gives an overview of current efforts in performance evaluation and a brief description of the datasets that have been used in QbE KWS, while Section “Conclusive Remarks” is dedicated to a discussion that aims to identify the current trends in QbE KWS.	en_UK
dc.language.iso	en	en_UK
dc.publisher	Nova Science Publishers, Inc	en_UK
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/	en_UK
dc.subject	Research Subject Categories::TECHNOLOGY::Information technology	en_UK
dc.subject	Handwritten keyword spotting (KWS)	en_UK
dc.subject	QBE query	en_UK
dc.subject	Segmentation	en_UK
dc.title	Handwriting: keyword spotting The Query by Example (QbE) case	en_UK
dc.title.alternative	Handwriting: Recognition, Development and Analysis. Chapter 11	en_UK
dc.type	Book chapter	en_UK

Files in this item

Name: Handwritten keyword spotting_2 ...
Size: 1.708Mb
Format: PDF
Description: Download Book Chapter

This item appears in the following Collection(s)

Book chapters (8)

Except where otherwise noted, this item's license is described as http://creativecommons.org/licenses/by-nc-nd/4.0/
