School Seminars and Colloquia

Using short read data in genetic studies to identify disease-causing mutations

Statistics Seminar

by Melanie Bahlo

Institution: The Walter and Eliza Hall Institute of Medical Research
Date: Tue 17th May 2011
Time: 1:00 PM
Location: Richard Berry Room 213

Abstract: Short read data are generated by massively parallel sequencing (MPS) technologies and consist of millions of short DNA fragments, each approximately one hundred base pairs (As, Gs, Cs and Ts) long. The short reads are randomly sheared pieces of a person’s genomic sequence, that is, their DNA. This new technology is revolutionizing the field of statistical genetics because it allows sequencing (the determination of a person’s genomic sequence) that is orders of magnitude faster than was previously possible.

I will discuss how these data are used in studies to find the genetic cause of disease and show an example from our lab, where we are trying to identify the cause of deafness in a family. Simple genetic elimination rules applied to the thousands of variants identified in the MPS data have so far failed to identify the causal variant. However, the experimental design allows us to apply a statistical ranking procedure that is agnostic to some of the genetic filters, and we use it to re-rank the list of variants.

I will also discuss another application of MPS in our lab, a follow-up of a genome-wide association study (Suppiah 2009), and show how much quality control (QC) went into producing the final set of test statistics.

