Computational Tools for Linked-read Sequencing Technology (Eric Zhang et al.)

The human genome holds the key to understanding the genetic basis of human evolution, hereditary illnesses, and many phenotypes. Whole-genome reconstruction and variant discovery, accomplished by analyzing data from whole-genome sequencing experiments, are foundational for studying human genomic variation and genotype-phenotype relationships. Over the past decades, cost-effective whole-genome sequencing has been revolutionized by short-fragment approaches, the most widespread of which are the successive, steadily improving generations of the original Solexa technology, now referred to as Illumina sequencing.

An alternative approach is offered by the 10x Genomics Chromium system, which distributes the DNA preparation into millions of partitions, where partition-specific barcode sequences are attached to short amplification products templated off the input fragments. Short reads that share a barcode therefore most likely derive from the same few long input fragments, which preserves long-range information that standard short-read sequencing loses. However, efficient software that handles this recently emerged technology and makes full use of this long-range information is still lacking. In this project, we propose to develop a series of computational tools for analyzing 10x linked-read data, covering read alignment, de novo assembly, variant detection, and metagenome binning, among other tasks. We believe our contribution can bring precision medicine one step closer to reality.
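To illustrate how barcodes carry long-range information, the following is a minimal sketch, not the project's actual software. It assumes reads have already been aligned and reduced to (barcode, chromosome, position) tuples, groups them by barcode, and merges nearby reads with the same barcode into putative source molecules. The function name `group_into_molecules` and the `max_gap` threshold of 50 kb are hypothetical choices made for this example.

```python
from collections import defaultdict


def group_into_molecules(reads, max_gap=50_000):
    """Cluster aligned reads into putative long molecules per barcode.

    reads   : iterable of (barcode, chrom, pos) tuples (assumed input format)
    max_gap : largest distance in bp between consecutive reads of one
              molecule (illustrative assumption, not a published default)

    Returns a dict mapping barcode -> list of (chrom, start, end, n_reads).
    """
    # Step 1: collect all alignment positions that share a barcode.
    by_barcode = defaultdict(list)
    for barcode, chrom, pos in reads:
        by_barcode[barcode].append((chrom, pos))

    # Step 2: within each barcode, merge reads that lie close together
    # on the same chromosome into one putative molecule.
    molecules = {}
    for barcode, hits in by_barcode.items():
        hits.sort()
        clusters = []
        for chrom, pos in hits:
            last = clusters[-1] if clusters else None
            if last and last[0] == chrom and pos - last[2] <= max_gap:
                # Extend the current molecule to cover this read.
                clusters[-1] = (last[0], last[1], pos, last[3] + 1)
            else:
                # Start a new molecule.
                clusters.append((chrom, pos, pos, 1))
        molecules[barcode] = clusters
    return molecules


if __name__ == "__main__":
    example_reads = [
        ("AAAC", "chr1", 1_000),
        ("AAAC", "chr1", 21_000),
        ("AAAC", "chr1", 900_000),
        ("GGTT", "chr2", 500),
    ]
    for barcode, mols in group_into_molecules(example_reads).items():
        print(barcode, mols)
```

In this toy input, the two nearby reads with barcode AAAC are merged into one putative molecule, while the distant third read starts a separate one. Real linked-read tools must additionally handle barcode sequencing errors, multi-mapped reads, and several molecules per partition.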

Related Publications:

For further information on this research topic, please contact Dr. Eric Zhang.