This project aims to characterise variation in the majority of known human genome repeats, and to discover any differences between the tumour and normal samples, and between the primary tumour and metastatic samples.
There are approximately one million tandem repeats (TRs) in the human genome, encompassing 10.6% of the entire genome. These TRs are prone to variation in repeat number and are thought to be a major source of genetic variation between individuals.
However, genomic analysis of these regions has been particularly challenging using short-read sequencing approaches. As a result, very little is known about the role of repeat variation during tumorigenesis.
We have sequenced a matched tumour/metastatic sample from a breast cancer patient using nanopore long-read sequencing. The median read length of greater than 10 kb allows for characterisation of repeat expansions relative to the reference genome.
The goal of this project is to characterise variation in the majority of known repeats, and to discover any differences between the tumour and normal samples, and also between the primary tumour and metastatic samples.
The candidate should possess strong bioinformatics/computational skills, including familiarity with the Linux command line and with one or more programming languages.