String matching is one of the oldest algorithmic techniques, yet still one of the most pervasive in computer science. The past 20 years have seen technological leaps in applications as diverse as information retrieval and compression. This copiously illustrated collection of puzzles and exercises in key areas of text algorithms and combinatorics on words offers graduate students and researchers a pleasant and direct way to learn and practice with advanced concepts. The problems are drawn from a large range of scientific publications, both classic and new. Building up from the basics, the book goes on to showcase problems in combinatorics on words (including Fibonacci or Thue-Morse words), pattern matching (including Knuth-Morris-Pratt and Boyer-Moore like algorithms), efficient text data structures (including suffix trees and suffix arrays), regularities in words (including periods and runs) and text compression (including Huffman, Lempel-Ziv and Burrows-Wheeler based methods).
The term “stringology” is a popular nickname for text algorithms, or algorithms on strings. This book deals with the most basic algorithms in the area. Most of them can be viewed as “algorithmic jewels” and deserve reader-friendly presentation. One of the main aims of the book is to present several of the most celebrated algorithms in a simple way by omitting obscuring details and separating algorithmic structure from combinatorial theoretical background. The book reflects the relationships between applications of text-algorithmic techniques and the classification of algorithms according to the measures of complexity considered. The text can be viewed as a parade of algorithms in which the main purpose is to discuss the foundations of the algorithms and their interconnections. One can partition the algorithmic problems discussed into practical and theoretical problems. Certainly, string matching and data compression are in the former class, while most problems related to symmetries and repetitions in texts are in the latter. However, all the problems are interesting from an algorithmic point of view and enable the reader to appreciate the importance of combinatorics on words as a tool in the design of efficient text algorithms. In most textbooks on algorithms and data structures, the presentation of efficient algorithms on words is quite short compared to issues in graph theory, sorting, searching, and some other areas. At the same time, there are many presentations of interesting algorithms on words accessible only in journals and in a form directed mainly at specialists. This book fills the gap in the book literature on algorithms on words, and brings together the many results presently dispersed across journal articles. The presentation is reader-friendly; many examples and about two hundred figures illustrate the behaviour of otherwise very complex algorithms.
This much-needed book on the design of algorithms and data structures for text processing emphasizes both theoretical foundations and practical applications. It is intended to serve both as a textbook for courses on algorithm design, especially those related to text processing, and as a reference for computer science professionals. The work takes a unique approach, one that goes more deeply into its topic than other more general books. It contains both classical algorithms and recent results of research on the subject. The book is the first text to contain a collection of a wide range of text algorithms, many of them quite new and appearing here for the first time. Other algorithms, while known by reputation, have never been published in the journal literature. Two such important algorithms are those of Karp, Miller and Rosenberg, and that of Weiner. Here they are presented together for the first time. The core of the book is the material on suffix trees and subword graphs, applications of these data structures, new approaches to time-space optimal string-matching, and text compression. Also covered are basic parallel algorithms for text problems. Applications of all these algorithms are given for problems involving data retrieval systems, treatment of natural languages, investigation of genomes, data compression software, and text processing tools. From the theoretical point of view, the book is a goldmine of paradigms for the development of efficient algorithms, providing the necessary foundation for creating practical software dealing with sequences. A crucial point in the authors' approach is the development of a methodology for presenting text algorithms so they can be fully understood. Throughout, the book emphasizes the efficiency of algorithms, holding that the essence of their usefulness depends on it.
This is especially important since the algorithms described here will find application in "Big Science" areas like molecular sequence analysis where the explosive growth of data has caused problems for the current generation of software. Finally, with its development of theoretical background, the book can be considered as a mathematical foundation for the analysis and production of text processing algorithms.
This book constitutes the refereed proceedings of the 16th Annual Symposium on Combinatorial Pattern Matching, CPM 2005, held on Jeju Island, Korea, on June 19-22, 2005. The 37 revised full papers presented were carefully reviewed and selected from 129 submissions. They constitute original research contributions in combinatorial pattern matching and its applications. Among the application fields addressed are computational biology, bioinformatics, genomics, proteinomics, data compression, sequence analysis and graphs, information retrieval, data analysis, and pattern recognition.
This volume presents the proceedings of the Fifth Annual Symposium on Combinatorial Pattern Matching, held at Asilomar, California, in June 1994. The 26 selected papers in this volume are organized in chapters on Alignments, Various Matchings, Combinatorial Aspects, and Bio-Informatics. Combinatorial Pattern Matching addresses issues of searching and matching of strings and more complicated patterns, such as trees. The goal is to derive non-trivial combinatorial properties for such structures and then to exploit these properties in order to achieve superior performance for the corresponding computational problems. In recent years, combinatorial pattern matching has developed into a full-fledged area of algorithmics and is expected to grow even further in the coming years.
This book constitutes the refereed proceedings of the 10th Annual Symposium on Combinatorial Pattern Matching, CPM 99, held in Warwick, UK in July 1999. The 21 revised papers presented were carefully reviewed and selected from 26 submissions. The papers address all current issues in combinatorial pattern matching dealing with a variety of classical objects like trees, regular expressions, graphs, point sets, and arrays as well as with DNA/RNA coding, WWW issues, information retrieval, data compression, and pattern recognition.
This book is a monograph on unitals embedded in finite projective planes. Unitals are an interesting structure found in square order projective planes, and numerous research articles constructing and discussing these structures have appeared in print. More importantly, there still are many open problems, and this remains a fruitful area for Ph.D. dissertations. Unitals play an important role in finite geometry as well as in related areas of mathematics. For example, unitals play a parallel role to Baer subplanes when considering extreme values for the size of a blocking set in a square order projective plane (see Section 2.3). Moreover, unitals meet the upper bound for the number of absolute points of any polarity in a square order projective plane (see Section 1.5). From an applications point of view, the linear codes arising from unitals have excellent technical properties (see Section 6.4). The automorphism group of the classical unital $H = H(2, q^2)$ is 2-transitive on the points of $H$, and so unitals are of interest in group theory. In the field of algebraic geometry over finite fields, $H$ is a maximal curve that contains the largest number of $\mathbb{F}_{q^2}$-rational points with respect to its genus, as established by the Hasse-Weil bound.
The book is intended for lectures on string processes and pattern matching in Master's courses of computer science and software engineering curricula. The details of algorithms are given with correctness proofs and complexity analysis, which make them ready to implement. Algorithms are described in a C-like language. The book is also a reference for students in computational linguistics or computational biology. It presents examples of questions related to the automatic processing of natural language, to the analysis of molecular sequences, and to the management of textual databases.