Character N-gram Tokenization, Cross-Language Information Retrieval, Information Retrieval, Parallel Corpora, Text Processing, Text Retrieval, Computer Science (0984)
Cluster Sampling; Finite-Mixture and Dirichlet-Multinomial Distributions; Generalized Estimating Equations; Marginal and Conditional Models for Overdispersion; Overdispersion; Random Effects; Statistics (0463)