AI Seminar
November 4, 2002

Oliver Hampton


Machine Learning in Bioinformatics: An Overview of Hidden Markov Models
and PCAPSS

Abstract:
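The foundations of Hidden Markov Models (HMMs) were laid out in a series
of papers by L. E. Baum and colleagues around 1970.  HMMs are statistical
models in which an observed sequence is treated as the output of a chain
of unobserved (hidden) states, with probabilities governing both the
transitions between states and the symbols each state emits.  HMMs were
originally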
used in speech recognition, but in the late 1980s became popular in the
fields of genetics and molecular biology.  Current applications of HMMs in
biology include chromosome mapping, biological sequence alignment,
protein structure prediction, inference of evolutionary relationships, and
gene finding.  At New Mexico State, the Southwest Biotechnology
Informatics Center (http://www.swbic.org) uses HMMs in the PCAPSS
research project.  PCAPSS (Protein Classification through the Assessment
of Predicted Secondary Structure) is a fold-recognition tool that helps
identify the enzymatic function of proteins that have dissimilar amino
acid sequences but similar overall structure.  From a single query
protein sequence, PCAPSS builds a hidden Markov model of predicted
secondary structure to search the Protein Data Bank (PDB) for proteins of
similar structure.
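
As a rough illustration of the kind of model described above, the
following Python sketch scores a string of secondary-structure symbols
(H = helix, E = strand, C = coil) under a toy two-state HMM using the
forward algorithm.  The state names, all probabilities, and the function
name forward_likelihood are hypothetical and are not taken from PCAPSS; a
real model would have its parameters estimated from data.

```python
# Toy HMM over secondary-structure symbols.
# Every state name and probability below is a made-up illustration,
# not a parameter of PCAPSS or of any published model.
STATES = ("helix_region", "strand_region")

# Probability of starting in each hidden state.
START = {"helix_region": 0.5, "strand_region": 0.5}

# TRANS[a][b] = probability of moving from hidden state a to hidden state b.
TRANS = {
    "helix_region": {"helix_region": 0.9, "strand_region": 0.1},
    "strand_region": {"helix_region": 0.1, "strand_region": 0.9},
}

# EMIT[s][x] = probability that hidden state s emits observed symbol x.
EMIT = {
    "helix_region": {"H": 0.7, "E": 0.1, "C": 0.2},
    "strand_region": {"H": 0.1, "E": 0.7, "C": 0.2},
}


def forward_likelihood(observed):
    """Return P(observed | model), summed over all hidden-state paths."""
    if not observed:
        return 1.0
    # alpha[s] = probability of the symbols seen so far, ending in state s.
    alpha = {s: START[s] * EMIT[s][observed[0]] for s in STATES}
    for symbol in observed[1:]:
        alpha = {
            s: EMIT[s][symbol] * sum(alpha[p] * TRANS[p][s] for p in STATES)
            for s in STATES
        }
    return sum(alpha.values())


if __name__ == "__main__":
    for query in ("HHHHCC", "HEHEHE"):
        print(query, forward_likelihood(query))
```

Under these assumed parameters, a string with long runs of one symbol
scores higher than one that alternates at every position, because the toy
model favors staying in the same hidden state.  Searching a database in
this spirit would mean ranking candidate strings by such a score, though
the actual PCAPSS scoring scheme may differ.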