In Monitoring Adaptive Spoken Dialog Systems, authors Alexander Schmitt and Wolfgang Minker investigate statistical approaches that allow for recognition of negative dialog patterns in Spoken Dialog Systems (SDS). The stochastic methods presented are flexible, portable and accurate. Beginning with the foundations of machine learning and pattern recognition, this monograph examines how frequently users show negative emotions in spoken dialog systems and develops novel approaches to speech-based emotion recognition using a hybrid approach to model emotions. The authors make use of statistical methods based on acoustic, linguistic and contextual features to examine the relationship between the interaction flow and the occurrence of emotions, using non-acted recordings of several thousand real users from commercial and non-commercial SDS. Additionally, the authors present novel statistical methods that spot problems within a dialog based on interaction patterns. The approaches enable future SDS to offer more natural and robust interactions. This work provides insights, lessons and inspiration for future research and development, not only for spoken dialog systems, but for data-driven approaches to human-machine interaction in general.
Proactive Spoken Dialogue Interaction in Multi-Party Environments describes spoken dialogue systems that act as independent dialogue partners in the conversation with and between users. The resulting novel characteristics, such as proactiveness and multi-party capabilities, pose new challenges for the dialogue management component of such a system and require the use and administration of an extensive dialogue history. To support the development of proactive spoken dialogue systems, a comprehensive data collection seems mandatory and may be performed in a Wizard-of-Oz environment. Such an environment also provides an appropriate basis for an extensive usability and acceptance evaluation. Proactive Spoken Dialogue Interaction in Multi-Party Environments is a useful reference for students and researchers in speech processing.
Speech and Human-Machine Dialog focuses on the dialog management component of a spoken language dialog system. Spoken language dialog systems provide a natural interface between humans and computers. These systems are of special interest for interactive applications, and they integrate several technologies including speech recognition, natural language understanding, dialog management and speech synthesis. Due to the conjunction of several factors throughout the past few years, humans are significantly changing their behavior vis-à-vis machines. In particular, the use of speech technologies will become normal in the professional domain, and in everyday life. The performance of speech recognition components has also significantly improved. This book includes various examples that illustrate the different functionalities of the dialog model in a representative application for train travel information retrieval (train time tables, prices and ticket reservation). Speech and Human-Machine Dialog is designed for a professional audience, composed of researchers and practitioners in industry. This book is also suitable as a secondary text for graduate-level students in computer science and engineering.
This book provides a detailed and up-to-date overview of classification and data mining methods. The first part focuses on supervised classification algorithms and their applications, including recent research on the combination of classifiers. The second part deals with unsupervised data mining and knowledge discovery, with special attention to text mining. Discovering the underlying structure of a data set has been a key research topic associated with unsupervised techniques, with multiple applications and challenges, from web-content mining to the inference of cancer subtypes in genomic microarray data. Among those, the book focuses on a new application for dialog systems, which can thereby be made adaptable and portable to different domains. Clustering evaluation metrics and new approaches, such as ensembles of clustering algorithms, are also described.
Reasoning for Information: Seeking and Planning Dialogues provides a logic-based reasoning component for spoken language dialogue systems. This component, called the Problem Assistant, is responsible for processing constraints on a possible solution obtained from various sources, namely the user and the system's domain-specific information. The authors also present findings on the implementation of a dialogue management interface to the Problem Assistant. The dialogue system supports simple mixed-initiative planning interactions in the TRAINS domain, which is still a relatively complex domain involving a number of logical constraints and relations that form the basis for the collaborative problem-solving behavior driving the dialogue.
Adaptive Multimodal Interactive Systems introduces a general framework for adapting multimodal interactive systems and comprises a detailed discussion of each of the steps required for adaptation. This book also investigates how interactive systems may be improved in terms of usability and user friendliness while describing the exhaustive user tests employed to evaluate the presented approaches. After introducing general theory, a generic approach for user modeling in interactive systems is presented, ranging from an observation of basic events to a description of higher-level user behavior. Adaptations are presented as a set of patterns similar to those known from software or usability engineering. These patterns describe recurring problems and present proven solutions. The authors include a discussion on when and how to employ patterns and provide guidance to the system designer who wants to add adaptivity to interactive systems. In addition to these patterns, the book introduces an adaptation framework, which exhibits an abstraction layer using Semantic Web technology. Adaptations are implemented on top of this abstraction layer by creating a semantic representation of the adaptation patterns. The patterns cover graphical interfaces as well as speech-based and multimodal interactive systems.
Introducing Spoken Dialogue Systems into Intelligent Environments outlines the formalisms of a novel knowledge-driven framework for spoken dialogue management and presents the implementation of a model-based Adaptive Spoken Dialogue Manager (ASDM) called OwlSpeak. The authors have identified three stakeholders that potentially influence the behavior of the ASDM: the user, the SDS, and a complex Intelligent Environment (IE) consisting of various devices, services, and task descriptions. The theoretical foundation of a working ontology-based spoken dialogue description framework, the prototype implementation of the ASDM, and the evaluation activities presented in this book contribute to ongoing spoken dialogue research by laying the groundwork for model-based adaptive spoken dialogue management. This monograph is ideal for advanced undergraduate students, PhD students, and postdocs as well as academic and industrial researchers and developers in speech and multimodal interactive systems.
Novel Techniques for Dialectal Arabic Speech describes approaches to improve automatic speech recognition for dialectal Arabic. Since speech resources for dialectal Arabic speech recognition are very sparse, the authors describe how existing Modern Standard Arabic (MSA) speech data can be applied to dialectal Arabic speech recognition, while assuming that MSA is always a second language for all Arabic speakers. In this book, Egyptian Colloquial Arabic (ECA) has been chosen as a typical Arabic dialect. ECA ranks first among Arabic dialects in number of speakers, and a high-quality ECA speech corpus with accurate phonetic transcription has been collected. MSA acoustic models were trained using news broadcast speech. In order to use MSA cross-lingually in dialectal Arabic speech recognition, the authors have normalized the phoneme sets for MSA and ECA. After this normalization, they have applied state-of-the-art acoustic model adaptation techniques such as Maximum Likelihood Linear Regression (MLLR) and Maximum A Posteriori (MAP) adaptation to adapt existing phonemic MSA acoustic models with a small amount of dialectal ECA speech data. Speech recognition results indicate a significant increase in recognition accuracy compared to a baseline model trained with only ECA data.
In this book, a novel approach that combines speech-based emotion recognition with adaptive human-computer dialogue modeling is described. With the robust recognition of emotions from speech signals as their goal, the authors analyze the effectiveness of using a plain emotion recognizer, a speech-emotion recognizer combining speech and emotion recognition, and multiple speech-emotion recognizers at the same time. The semi-stochastic dialogue model employed relates user emotion management to the corresponding dialogue interaction history and allows the device to adapt itself to the context, including altering the stylistic realization of its speech. This comprehensive volume begins by introducing spoken language dialogue systems and providing an overview of human emotions, theories, categorization and emotional speech. It moves on to cover the adaptive semi-stochastic dialogue model and the basic concepts of speech-emotion recognition. Finally, the authors show how speech-emotion recognizers can be optimized, and how an adaptive dialogue manager can be implemented. The book, with its novel methods to perform robust speech-based emotion recognition at low complexity, will be of interest to a variety of readers involved in human-computer interaction.
This book addresses the problem of separating spontaneous multi-party speech by way of microphone arrays (beamformers) and adaptive signal processing techniques. It is written in a concise manner, and an effort has been made to ensure that all presented algorithms can be straightforwardly implemented by the reader. All experimental results have been obtained with real in-car microphone recordings involving simultaneous speech of the driver and the co-driver.
In this book, hierarchical structures based on neural networks are investigated for automatic speech recognition. These structures are mainly evaluated within the phoneme recognition task under the Hybrid Hidden Markov Model/Artificial Neural Network (HMM/ANN) paradigm. The baseline hierarchical scheme consists of two levels, each of which is based on a Multilayered Perceptron (MLP). The output of the first level is used as an input for the second level. This system can be substantially sped up by removing the redundant information contained in the output of the first level.
Incorporating Knowledge Sources into Statistical Speech Recognition addresses the problem of developing efficient automatic speech recognition (ASR) systems that balance the use of broad knowledge of speech variability against keeping the training and recognition effort feasible, while improving speech recognition performance. The book provides an efficient general framework to incorporate additional knowledge sources into state-of-the-art statistical ASR systems. It can be applied in flexible ways to many existing ASR problems with their respective model-based likelihood functions.
Current speech recognition systems are based on speaker-independent speech models and suffer from inter-speaker variations in speech signal characteristics. This work develops an integrated approach for speech and speaker recognition in order to open up self-learning opportunities for the system. It introduces a reliable speaker identification which enables the speech recognizer to create robust speaker-dependent models. In addition, this book gives a new approach to the reverse problem: how to improve speech recognition if speakers can be recognized. The speaker identification enables the speaker adaptation to adapt to different speakers, which results in an optimal long-term adaptation.
In this work, the authors present a fully statistical approach to model non-native speakers' pronunciation. Second-language speakers pronounce words in multiple different ways compared to native speakers. Those deviations, be they phoneme substitutions, deletions or insertions, can be modelled automatically with the new method presented here. The method is based on a discrete hidden Markov model as a word pronunciation model, initialized on a standard pronunciation dictionary. The implementation and functionality of the methodology have been verified with a test set of non-native English in the accent in question. The book is written for researchers with a professional interest in phonetics and automatic speech and speaker recognition.
Stochastically-Based Semantic Analysis investigates the problem of automatic natural language understanding in a spoken language dialog system. The focus is on the design of a stochastic parser and its evaluation with respect to a conventional rule-based method. Stochastically-Based Semantic Analysis will be of most interest to researchers in artificial intelligence, especially those in natural language processing, computational linguistics, and speech recognition. It will also appeal to practicing engineers who work in the area of interactive speech systems.
Intelligent environments represent an emerging topic in research. Next Generation Intelligent Environments: Ambient Adaptive Systems will cover all key topics in the field of intelligent ambient adaptive systems. It focuses on the results worked out within the framework of the ATRACO (Adaptive and TRusted Ambient eCOlogies) project. The theoretical background, the developed prototypes, and the evaluated results form a fertile ground useful for the broad intelligent environments scientific community as well as for industrial interest groups. Features of the book include: a unique and original collection of chapters on intelligent ambient adaptive systems; broad coverage of the field of intelligent environments research and evaluation, as well as topics such as adaptation within activity spheres; and developed prototypes as examples for readers. Computer scientists, engineers and others who work in the area of ambient environments will find the edition interesting and useful to their own work. In addition, graduate students and Ph.D. students specializing in the area of intelligent environments may also use this book to get a concrete idea of the major issues to consider when developing intelligent environments in practice.
Bandwidth Extension of Speech Signals describes the theory and methods for quality enhancement of clean speech signals and distorted speech signals such as those that have undergone a band limitation, for instance, in a telephone network. Problems and the respective solutions are discussed for the different approaches. The approaches are evaluated, and a real-time implementation of the most promising one is presented. The book includes topics related to speech coding, pattern and speech recognition, speech enhancement, statistics and digital signal processing in general.
Speech Recognition for Mobile Phones considers practical aspects of the development and deployment of client-server speech-enabled information systems. The authors discuss different paradigms for speech recognition on mobile devices. The strengths of the distributed speech recognition (DSR) technique are demonstrated. Platforms that have proven to be suitable for the implementation of acoustic front-ends on cellular phones (including Java Micro Edition (J2ME) and Symbian C++) are analyzed. In addition, an introduction to corresponding integrated development environments such as Eclipse and Carbide is provided. The authors study issues related to efficient data transmission over GSM and 3G networks, compare data transmission systems based on TCP/IP and UDP/IP protocols, and highlight their advantages and drawbacks. Finally, the use of DSR technology for practical applications, such as a speech-enabled public transportation information access system for mobile phones, is demonstrated. This book provides the reader with plug-in ready solutions for the deployment of DSR systems on conventional mobile appliances that operate on existing network infrastructures.
This volume consists of papers presented at the Second International Conference on Algebraic and Logic Programming in Nancy, France, October 1-3, 1990.
Since both the contents and the structure of the book appeared to be successful, only minor changes were made. In particular, some recent work in ATP has been incorporated so that the book continues to reflect the state of the art in the field. The most significant change is in the quality of the layout, including the removal of a number of inaccuracies and typing errors. R. Caferra, E. Eder, F. van der Linden, and J. Muller have caught various minor errors. P. Haddawy and S.T. Pope have provided many stylistic improvements of the English text. Last but not least, A. Bentrup and W. Fischer have produced the beautiful layout. The extensive work of typesetting was financially supported within ESPRIT project 415.
Munchen, September 1986. W. Bibel
PREFACE. Among the dreams of mankind is the one dealing with the mechanization of human thought. As the world today has become so complex that humans apparently fail to manage it properly with their intellectual gifts, the realization of this dream might be regarded even as something like a necessity. On the other hand, the incredible advances in computer technology let it appear as a real possibility.