An improving method for estimating amino acid replacement models

Bibliographic Details
Main Author: Lê, Văn Đạt
Format: Theses and Dissertations
Language: other
Published: Đại học Quốc gia Hà Nội 2016
Subjects:
Online Access: http://repository.vnu.edu.vn/handle/VNU_123/8266
Description
Summary: Amino acid replacement models (also called amino acid substitution models or matrices) play important roles in protein phylogenetic analysis and protein sequence alignment. Dayhoff was the first to propose a method for building amino acid models, in 1972. Currently, maximum likelihood (ML) methods are widely used to estimate popular models such as WAG, LG, and FLU. However, ML methods are slow and not applicable to large datasets. The most time-consuming step in estimating matrices is building phylogenetic trees from protein alignments. In this thesis, we propose new methods that overcome this obstacle by splitting large alignments into small ones that still contain enough evolutionary information for estimating matrices. Experiments with both Pfam and FLU datasets show that the proposed methods are about three to nine times faster than the best current method, while the quality of the estimated matrices is nearly the same. Thus, our methods will enable researchers to estimate matrices from very large datasets.