[Faculty] Fwd: [CSRC-SDSU COLLOQUIUM]: Finding a novel way for fast sequence alignment and exploiting information theory in bacterial genomes and complete phages

Tue Aug 27 14:04:46 PDT 2013

DATE: Friday, August 30th, 2013

TITLE: Finding a novel way for fast sequence alignment and exploiting
information theory in bacterial genomes and complete phages

TIME: 3:30 PM
LOCATION: GMCS 214

SPEAKER: Sajia Akhter. Computational Science Research Center at SDSU.

The invention of next-generation sequencing (NGS) technology provides
the capability of generating high-throughput, low-cost sequencing data,
and is used by scientists to address a diverse range of biological
problems. Several data analysis algorithms have been developed in the
last few years to best exploit NGS data. New tools and methods have also
been implemented for better understanding of these data. This talk
presents several novel techniques involving NGS datasets. The first
technique, qudaich, is a novel sequence aligner, which can be used as a
key part of NGS data analysis. Qudaich generates the pairwise local
alignments of a query dataset against a database. Qudaich can
efficiently process large volumes of data and is well suited to
next-generation read datasets. This aligner can also handle both DNA
and protein sequences and tries to generate the best possible
alignment for each query sequence. Compared with other contemporary
aligners, qudaich performs better in both execution time and accuracy.
Next, in this talk, I show different ways to extract useful genomic
information from NGS data, which, in turn, point to promising
directions for solving existing biological problems such as prophage
prediction. Prophages are viruses that have integrated into, and
replicate as part of, the bacterial genome. These genetic elements
can have a tremendous impact on their hosts. Most other phage-finding
tools rely mainly on a homology-based approach to prophage prediction,
which limits the de novo discovery of novel prophages. This work also
presents a novel algorithm, PhiSpy, to predict prophages in bacterial
genomes. PhiSpy combines similarity-based and composition-based
strategies to identify prophages. It finds 94% of the known prophages
in 50 complete bacterial genomes, with a 6% false-negative rate and a
0.66% false-positive rate. This led to the successful prediction of
the largest set of prophages compared with other prophage-finding
applications. Finally, this work also demonstrates that
information theory can be effectively applied to find informative
sequences, to predict the lifestyle restrictions of an organism, and
to analyze the deviation of the amino acid utilization profile in
different metabolic processes in different organisms. Together, these
tools will enable the next generation of sequence analyses using
next-generation sequence data.

HOST: Dr. Jose Castillo.

For future events, please visit our website at:

http://www.csrc.sdsu.edu/colloquium.html

-- 
Jose E. Castillo  Ph.D.

Director / Professor

Computational Science Research Center

5500 Campanile Dr

San Diego State University

San Diego CA 92182-1245

619 5947205/3430, Fax 619-594-2459

 http://www.csrc.sdsu.edu/mimetic-book/
