N-gram language models and smoothing techniques

[Antoniol et al., 1994]
G. Antoniol, F. Brugnara, M. Cettolo, and M. Federico. Language model estimations and representations for real-time continuous speech recognition. In ICSLP, pages 859-862, Yokohama, 1994.

[Bahl et al., 1989]
L. Bahl, P. Brown, P. de Souza, and R. Mercer. A tree-based statistical language model for natural language speech recognition. IEEE Transactions on Acoustics, Speech and Signal Processing, 37(7):1001-1008, 1989.

[Bordel et al., 1994]
G. Bordel, I. Torres, and E. Vidal. Back-off smoothing in a syntactic approach to language modelling. In ICSLP, pages xx-yy, Yokohama, 1994.

[Bordel et al., 1995]
G. Bordel, I. Torres, and E. Vidal. QWI: A method for improved smoothing in language modelling. In International Conference on Acoustics, Speech and Signal Processing, pages xx-yy, 1995.

[Della Pietra et al., 1994]
S. Della Pietra, V. Della Pietra, J. Gillett, J. Lafferty, H. Printz, and L. Ureš. Inference and estimation of a long-range trigram model. In Grammatical Inference and Applications, ICGI'94, number 862 in Lecture Notes in Artificial Intelligence, pages 78-92. Springer Verlag, 1994.

[Dupont and Rosenfeld, 1997]
P. Dupont and R. Rosenfeld. Lattice based language models. Technical Report CMU-CS-97-173, Carnegie Mellon University, 1997.

[Dupont, 1995]
P. Dupont. Interpolated word and class bigram models for Spanish conversational speech recognition. In IEEE Automatic Speech Recognition Workshop, pages 121-122, Snowbird, USA, 1995.

[El-Bèze, 1993]
M. El-Bèze. Les Modèles de Langage Probabilistes : Quelques Domaines d'Application. Habilitation à diriger des recherches, LIPN: Université Paris-Nord, 1993.

[Essen and Steinbiss, 1992]
U. Essen and V. Steinbiss. Cooccurrence smoothing for stochastic language modeling. In International Conference on Acoustics, Speech and Signal Processing, pages 161-164, 1992.

[Gupta et al., 1992]
V. Gupta, M. Lennig, and P. Mermelstein. A language model for very large-vocabulary speech recognition. Computer Speech and Language, 6:331-344, 1992.

[Jelinek, 1991]
F. Jelinek. Self-organized language modeling for speech recognition. In A. Waibel and K.F. Lee, editors, Readings in Speech Recognition, pages 450-506. Morgan Kaufmann, 1991.

[Jelinek, 1993]
F. Jelinek. Up from trigrams, the struggle for improved language models. In European Conference on Speech Communication and Technology, pages 1037-1040, Berlin, 1993.

[Katz, 1987]
S.M. Katz. Estimation of probabilities from sparse data for the language model component of a speech recognizer. IEEE Transactions on Acoustics, Speech and Signal Processing, 35(3):400-401, 1987.

[Kneser and Ney, 1995]
R. Kneser and H. Ney. Improved backing-off for M-gram language modeling. In International Conference on Acoustics, Speech and Signal Processing, pages 181-184, 1995.

[Kuhn and De Mori, 1990]
R. Kuhn and R. De Mori. A cache-based natural language model for speech recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(6):570-583, 1990.

[Ney and Essen, 1993]
H. Ney and U. Essen. Estimating 'small' probabilities by leaving-one-out. In European Conference on Speech Communication and Technology, pages 2239-2242, Berlin, 1993.

[Ney et al., 1994]
H. Ney, U. Essen, and R. Kneser. On structuring probabilistic dependences in stochastic language modelling. Computer Speech and Language, 8:1-38, 1994.

pdupont@info.ucl.ac.be