N-gram language models and smoothing techniques

[Antoniol et al., 1994]: G. Antoniol, F. Brugnara, M. Cettolo, and M. Federico. Language model estimations and representations for real-time continuous speech recognition. In ICSLP, pages 859-862, Yokohama, 1994.
[Bahl et al., 1989]: L. Bahl, P. Brown, P. de Souza, and R. Mercer. A tree-based statistical language model for natural language speech recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing, 37(7):1001-1008, 1989.
[Bordel et al., 1994]: G. Bordel, I. Torres, and E. Vidal. Back-off smoothing in a syntactic approach to language modelling. In ICSLP, pages xx-yy, Yokohama, 1994.
[Bordel et al., 1995]: G. Bordel, I. Torres, and E. Vidal. Qwi: A method for improved smoothing in language modelling. In International Conference on Acoustics, Speech, and Signal Processing, pages xx-yy, 1995.
[Della Pietra et al., 1994]: S. Della Pietra, V. Della Pietra, J. Gillett, J. Lafferty, H. Printz, and L. Ureš. Inference and estimation of a long-range trigram model. In Grammatical Inference and Applications, ICGI'94, number 862 in Lecture Notes in Artificial Intelligence, pages 78-92. Springer-Verlag, 1994.
[Dupont and Rosenfeld, 1997]: P. Dupont and R. Rosenfeld. Lattice based language models. Technical Report CMU-CS-97-173, Carnegie Mellon University, 1997.
[Dupont, 1995]: P. Dupont. Interpolated word and class bigram models for Spanish conversational speech recognition. In IEEE Automatic Speech Recognition Workshop, pages 121-122, Snowbird, USA, 1995.
[El-Bèze, 1993]: M. El-Bèze. Les Modèles de Langage Probabilistes : Quelques Domaines d'Application. Habilitation à diriger des recherches, LIPN: Université Paris-Nord, 1993.
[Essen and Steinbiss, 1992]: U. Essen and V. Steinbiss. Cooccurrence smoothing for stochastic language modeling. In International Conference on Acoustics, Speech, and Signal Processing, pages 161-164, 1992.
[Gupta et al., 1992]: V. Gupta, M. Lennig, and P. Mermelstein. A language model for very large-vocabulary speech recognition. Computer Speech and Language, 6:331-344, 1992.
[Jelinek, 1991]: F. Jelinek. Self-organized language modeling for speech recognition. In A. Waibel and K.F. Lee, editors, Readings in Speech Recognition, pages 450-506. Morgan Kaufmann, 1991.
[Jelinek, 1993]: F. Jelinek. Up from trigrams, the struggle for improved language models. In European Conference on Speech Communication and Technology, pages 1037-1040, Berlin, 1993.
[Katz, 1987]: S.M. Katz. Estimation of probabilities from sparse data for the language model component of a speech recognizer. IEEE Transactions on Acoustics, Speech, and Signal Processing, 35(3):400-401, 1987.
[Kneser and Ney, 1995]: R. Kneser and H. Ney. Improved backing-off for M-gram language modeling. In International Conference on Acoustics, Speech, and Signal Processing, pages 181-184, 1995.
[Kuhn and De Mori, 1990]: R. Kuhn and R. De Mori. A cache-based natural language model for speech recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(6):570-583, 1990.
[Ney and Essen, 1993]: H. Ney and U. Essen. Estimating 'small' probabilities by leaving-one-out. In European Conference on Speech Communication and Technology, pages 2239-2242, Berlin, 1993.
[Ney et al., 1994]: H. Ney, U. Essen, and R. Kneser. On structuring probabilistic dependences in stochastic language modelling. Computer Speech and Language, 8:1-38, 1994.

pdupont@info.ucl.ac.be