Moderated Statistical Tests for Digital Gene Expression Technologies
by Gordon Smyth
Abstract: Digital gene expression (DGE) technologies measure
gene expression by counting sequence tags. They are sensitive technologies
for measuring gene expression on a genomic scale, without
the need for prior knowledge of the genome sequence. As the cost of
sequencing DNA decreases, the number of DGE datasets is expected
to grow dramatically.
Various tests of differential expression have been proposed for
replicated DGE data using binomial, Poisson, negative binomial or
pseudo-likelihood (PL) models for the counts, but none of these
are usable when the number of replicates is very small.
We develop tests using the negative binomial distribution
to model overdispersion relative to the Poisson, and use conditional
weighted likelihood to moderate the level of overdispersion across
genes. A heuristic empirical Bayes algorithm is developed which is
applicable to very general likelihood estimation contexts.
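
As a rough illustration of the idea, the sketch below writes the negative binomial with mean mu and dispersion phi, so that the variance is mu + phi * mu^2 and phi = 0 recovers the Poisson. It then picks a per-gene dispersion by maximising a weighted likelihood: the gene's own log-likelihood plus alpha times the log-likelihood summed over all genes. Several things here are assumptions made for illustration only: the function names, the weight alpha, the grid search, and above all the use of the ordinary log-likelihood with mu fixed at the sample mean in place of the conditional likelihood described in the abstract.

```python
import math

def nb_logpmf(y, mu, phi):
    """Log-probability of count y under a negative binomial with
    mean mu and dispersion phi (variance = mu + phi * mu**2)."""
    if mu == 0:
        # Degenerate case: all probability mass sits on zero.
        return 0.0 if y == 0 else float("-inf")
    r = 1.0 / phi  # the "size" parameter
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu))
            + y * math.log(mu / (r + mu)))

def gene_loglik(counts, phi):
    """Log-likelihood of one gene's replicate counts at dispersion phi,
    with the mean fixed at the sample mean (a simplification)."""
    mu = sum(counts) / len(counts)
    return sum(nb_logpmf(y, mu, phi) for y in counts)

def moderated_dispersion(counts_by_gene, gene, alpha, grid):
    """Grid-search maximiser of the weighted likelihood
    l_g(phi) + alpha * sum over all genes of l_h(phi)."""
    def weighted(phi):
        common = sum(gene_loglik(c, phi) for c in counts_by_gene)
        return gene_loglik(counts_by_gene[gene], phi) + alpha * common
    return max(grid, key=weighted)

# Hypothetical tag counts: four genes, two replicates each.
counts_by_gene = [[3, 9], [20, 22], [0, 4], [15, 40]]
grid = [0.01 * k for k in range(1, 201)]

for alpha in (0.0, 1.0, 100.0):
    phi_hat = moderated_dispersion(counts_by_gene, 0, alpha, grid)
    print(f"alpha={alpha}: dispersion for gene 0 = {phi_hat:.2f}")
```

With alpha = 0 each gene is estimated on its own, which is unstable with only two replicates. As alpha grows, the estimate is pulled towards the value that best fits all genes together.
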
Not only is our strategy applicable even with the smallest number
of replicates, but it also proves to be more powerful than previous
strategies when more replicates are available. The methodology is
applicable to other counting technologies, such as proteomic
For More Information: Dr Owen Jones O.D.Jones@ms.unimelb.edu.au