HONG KONG BAPTIST UNIVERSITY
FACULTY OF SCIENCE

Department of Computer Science Seminar
2017 Series

Analyze Microbial Communities to Aid Health and Ecology Studies Using Next-generation Sequencing Data

Dr. Yanni Sun
Associate Professor
Department of Computer Science and Engineering
Michigan State University, USA

Date: June 8, 2017 (Thursday)
Time: 5:00 - 6:00 pm
Venue: FSC703, Fong Shu Chuen Library, Ho Sin Hang Campus

Abstract
Known as the blueprint of life, the genomic sequence contains instructions for controlling a species’ growth, development, survival, and reproduction. Next-generation sequencing (NGS) technologies, which produce vast amounts of sequencing data for various life forms, have provided tremendous information for tackling grand challenges, from finding more effective treatments for human diseases to improving biofuel energy production. In particular, combined with modern molecular biology techniques, NGS allows scientists to sequence both culturable and unculturable microbes in human microbiota and natural environments at unprecedented depth and resolution (also known as metagenomic sequencing). However, in contrast to the rapid accumulation of microbial community data, the data analysis methods and tools that can take full advantage of this sequencing power lag seriously behind. Thus, there is a pressing need to convert the BIG NGS data into knowledge.

In this talk, I will present our recent work on functional and composition analysis for microbial community data. In the first part of my talk, I will focus on functional analysis of microbial community data generated using third-generation sequencing platforms such as Single Molecule, Real-Time (SMRT) Sequencing. Long reads have the potential to characterize complex microbial communities accurately. However, their current applications are hampered by a high sequencing error rate. I will introduce our recently developed algorithm, which incorporates protein families and SMRT read alignment for error correction and sensitive homology search. By applying our algorithm to a human arm metagenomic data set, we can clearly identify more protein homologs.

The second part of my talk will focus on characterization of intra-host viral populations using NGS data. Many clinically important RNA viruses, such as HIV, HCV, SARS-CoV, and influenza, have a high mutation rate during replication and thus form a population of related but different viral strains, referred to as a quasispecies. Reconstructing each strain sequence is highly important for the development of clinical prevention and treatment. I will present our work on effective reconstruction of all haplotypes in quasispecies using NGS data.

Biography
Yanni Sun is an Associate Professor in the Department of Computer Science and Engineering at Michigan State University, USA. She received her B.S. and M.S. degrees from Xi'an JiaoTong University (China), both in Computer Science, and her Ph.D. degree in Computer Science from Washington University in Saint Louis, USA, in 2008. She works in bioinformatics and computational biology. In particular, her recent research interests include sequence analysis, next-generation sequencing data analysis, metagenomics, protein domain annotation, and noncoding RNA annotation. She received an NSF CAREER Award in 2010.

********* ALL INTERESTED ARE WELCOME ***********
(For enquiry, please contact Computer Science Department at 3411 2385)

http://www.comp.hkbu.edu.hk/v1/?page=seminars&id=427&lang=sc