morgan@unix.SRI.COM (Morgan Kaufmann) (11/02/90)
Morgan Kaufmann Publishers announces the publication of a new title in our
series of "Readings" books:
READINGS IN SPEECH RECOGNITION
edited by Alex Waibel and Kai-Fu Lee
(Carnegie Mellon Univ.)
After two decades of considerable activity, speech recognition is beginning
to show promise as a practical technology, and interest in the field is
growing dramatically. READINGS IN SPEECH RECOGNITION provides a collection
of key, seminal papers that have influenced or redirected the field and that
illustrate the central insights that have emerged over the years.
The editors provide an introduction to the field, its concerns and research
problems. Subsequent chapters are devoted to the main schools of thought
and design philosophies that have motivated different approaches to speech
recognition system design. Each chapter includes an introduction to the
papers that highlights the major insights or needs that have motivated an
approach to a problem and describes what that approach shares with, and
how it differs from, the others in the book.
ISBN: 1-55860-124-4
Price: $38.95
629 pages, softbound
TABLE OF CONTENTS
Chapter 1 Why Study Speech Recognition? 1
Introduction 1
Dimensions of Difficulty in Speech Recognition 2
The Chapters of this Book 3
Further Study 4
References 5
Chapter 2 Problems and Opportunities 7
Introduction 7
2.1 Speech Recognition by Machine: A Review 8
D. R. Reddy
2.2 The Value of Speech Recognition Systems 39
W. A. Lea
Chapter 3 Speech Analysis 47
Introduction 47
References 48
3.1 Digital Representations of Speech Signals 49
R. W. Schafer and L. R. Rabiner
3.2 Comparison of Parametric Representations for Monosyllabic
Word Recognition in Continuously Spoken Sentences 65
S. B. Davis and P. Mermelstein
3.3 Vector Quantization 75
R. M. Gray
3.4 A Joint Synchrony/Mean-Rate Model of Auditory Speech
Processing 101
S. Seneff
Chapter 4 Template-Based Approaches 113
Introduction 113
References 114
4.1 Isolated and Connected Word Recognition--Theory and Selected
Applications 115
L. R. Rabiner and S. E. Levinson
4.2 Minimum Prediction Residual Principle Applied to Speech
Recognition 154
F. Itakura
4.3 Dynamic Programming Algorithm Optimization for Spoken
Word Recognition 159
H. Sakoe and S. Chiba
4.4 Speaker-Independent Recognition of Isolated Words
Using Clustering Techniques 166
L. R. Rabiner, S. E. Levinson, A. E. Rosenberg, and J. G.
Wilpon
4.5 Two-Level DP-Matching--A Dynamic Programming-Based
Pattern Matching Algorithm for Connected Word Recognition 180
H. Sakoe
4.6 The Use of a One-Stage Dynamic Programming
Algorithm for Connected Word Recognition 188
H. Ney
Chapter 5 Knowledge-Based Approaches 197
Introduction 197
References 198
5.1 The Use of Speech Knowledge in Automatic Speech Recognition 200
V. W. Zue
5.2 Performing Fine Phonetic Distinctions: Templates versus
Features 214
R. A. Cole, R. M. Stern, and M. J. Lasry
5.3 Recognition of Speaker-Dependent Continuous Speech with KEAL 225
G. Mercier, D. Bigorgne, L. Miclet, L. Le Guennec,
and M. Querre
5.4 The Hearsay-II Speech Understanding System: A Tutorial 235
L. D. Erman and V. R. Lesser
5.5 Learning and Plan Refinement in a Knowledge-Based
System for Automatic Speech Recognition 246
R. De Mori, L. Lam, and M. Gilloux
Chapter 6 Stochastic Approaches 263
Introduction 263
References 265
6.1 A Tutorial on Hidden Markov Models and
Selected Applications in Speech Recognition 267
L. R. Rabiner
6.2 Stochastic Modeling for Automatic Speech Understanding 297
J. K. Baker
6.3 A Maximum Likelihood Approach to Continuous Speech Recognition 308
L. R. Bahl, F. Jelinek, and R. L. Mercer
6.4 High Performance Connected Digit Recognition
Using Hidden Markov Models 320
L. R. Rabiner, J. G. Wilpon, and F. K. Soong
6.5 Speech Recognition With Continuous-Parameter
Hidden Markov Models 332
L. R. Bahl, P. F. Brown, P. V. de Souza, and R. L. Mercer
6.6 Semi-Continuous Hidden Markov Models for Speech Signals 340
X. D. Huang and M. A. Jack
6.7 Context-Dependent Phonetic Hidden Markov Models
for Speaker-Independent Continuous Speech Recognition 347
K-F. Lee
6.8 A Stochastic Segment Model for Phoneme-Based
Continuous Speech Recognition 367
S. Roucos and M. O. Dunham
Chapter 7 Connectionist Approaches 371
Introduction 371
References 372
7.1 Review of Neural Networks for Speech Recognition 374
R. P. Lippmann
7.2 Phoneme Recognition Using Time-Delay Neural Networks 393
A. Waibel, T. Hanazawa, G. Hinton, K. Shikano, and K. J. Lang
7.3 Consonant Recognition by Modular Construction of
Large Phonemic Time-Delay Neural Networks 405
A. Waibel, H. Sawai, and K. Shikano
7.4 Learned Phonetic Discrimination Using Connectionist Networks 409
R. L. Watrous, L. Shastri, and A. H. Waibel
7.5 The "Neural" Phonetic Typewriter 413
T. Kohonen
7.6 Shift-Tolerant LVQ and Hybrid LVQ-HMM for Phoneme Recognition 425
E. McDermott, H. Iwamida, S. Katagiri, and Y. Tohkura
7.7 Speaker-Independent Word Recognition Using Dynamic Programming
Neural Networks 439
H. Sakoe, R. Isotani, K. Yoshida, K. Iso, and T. Watanabe
7.8 Speaker-Independent Word Recognition Using a
Neural Prediction Model 443
K. Iso and T. Watanabe
Chapter 8 Language Processing for Speech Recognition 447
Introduction 447
References 449
8.1 Self-Organized Language Modeling for Speech Recognition 450
F. Jelinek
8.2 A Tree-Based Statistical Language Model for Natural
Language Speech Recognition 507
L. R. Bahl, P. F. Brown, P. V. de Souza, and R. L. Mercer
8.3 Modification of Earley's Algorithm for Speech Recognition 515
A. Paeseler
8.4 Language Processing for Speech Understanding 519
W. A. Woods
8.5 Prosodic Knowledge Sources for Word Hypothesization
in a Continuous Speech Recognition System 534
A. Waibel
8.6 High Level Knowledge Sources in Usable Speech
Recognition Systems 538
S. R. Young, A. G. Hauptmann, W. H. Ward, E. T. Smith,
and P. Werner
Chapter 9 Systems 551
Introduction 551
References 552
9.1 Review of the ARPA Speech Understanding Project 554
D. H. Klatt
9.2 The Harpy Speech Understanding System 576
B. Lowerre
9.3 The Development of an Experimental Discrete Dictation Recognizer 587
F. Jelinek
9.4 BYBLOS: The BBN Continuous Speech Recognition System 596
Y. L. Chow, M. O. Dunham, O. A. Kimball, M. A. Krasner,
G. F. Kubala, J. Makhoul, P. J. Price, S. Roucos,
and R. M. Schwartz
9.5 An Overview of the SPHINX Speech Recognition System 600
K-F. Lee, H-W. Hon, and R. Reddy
9.6 ATR HMM-LR Continuous Speech Recognition System 611
T. Hanazawa, K. Kita, S. Nakamura, T. Kawabata, and K. Shikano
9.7 A Word Hypothesizer for a Large Vocabulary Continuous
Speech Understanding System 615
L. Fissore, P. Laface, G. Micca, and R. Pieraccini
Index 619
Credits 627
_________________________________________________________________
Ordering Information:
Please add $3.50 for the first book and $2.50 for each additional book for
surface shipping to the U.S. and Canada; $6.50 for the first book and
$3.50 for each additional book for shipping to all other areas.
MasterCard, Visa, and personal checks drawn on US banks are
accepted.
California residents please add appropriate sales tax.
Morgan Kaufmann Publishers
Department WP
2929 Campus Drive, Suite 260
San Mateo, CA 94403
USA
Phone: (800) 745-READ US and Canada, (415) 578-9928 elsewhere
Fax: (415) 578-0672