An improving method for estimating amino acid replacement models

Bibliographic details
Main author: Lê, Văn Đạt
Format: Theses and Dissertations
Language: other
Published: Đại học Quốc gia Hà Nội, 2016
Online access: http://repository.vnu.edu.vn/handle/VNU_123/8266
Description
Abstract: Amino acid replacement models (also called amino acid substitution models or matrices) play important roles in protein phylogenetic analysis and protein sequence alignment. Dayhoff was the first to propose a method for building amino acid models, in 1972. Currently, maximum likelihood (ML) methods are widely used to estimate popular models such as WAG, LG, and FLU. However, ML methods are slow and not applicable to large datasets. The most time-consuming step in estimating matrices is building phylogenetic trees from protein alignments. In this thesis, we propose new methods that overcome this obstacle by splitting large alignments into small ones that still contain enough evolutionary information for estimating matrices. Experiments with both Pfam and FLU datasets show that the proposed methods are about three to nine times faster than the best current method, while the quality of the estimated matrices is nearly the same. Thus, our methods will enable researchers to estimate matrices from very large datasets.
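
Illustration: the abstract does not say how large alignments are split, so the following is only a minimal sketch of the general idea, assuming sequences are assigned to sub-alignments at random. The function name `split_alignment`, its parameters, and the toy sequences are hypothetical and are not taken from the thesis.

```python
import random
from typing import Dict, List


def split_alignment(
    alignment: Dict[str, str], max_seqs: int, seed: int = 0
) -> List[Dict[str, str]]:
    """Partition an alignment (name -> aligned sequence) into sub-alignments.

    Assumption: sequences are assigned to sub-alignments at random; the
    thesis may use a different (for example tree-guided) strategy.
    """
    if max_seqs < 2:
        raise ValueError("each sub-alignment needs at least two sequences")

    # Shuffle the sequence names reproducibly, then cut them into chunks.
    names = sorted(alignment)
    random.Random(seed).shuffle(names)
    chunks = [names[i:i + max_seqs] for i in range(0, len(names), max_seqs)]

    # A single leftover sequence carries no pairwise evolutionary signal,
    # so fold it into the previous chunk.
    if len(chunks) > 1 and len(chunks[-1]) < 2:
        leftover = chunks.pop()
        chunks[-1].extend(leftover)

    # Columns are left untouched: every sub-alignment keeps the full length.
    return [{name: alignment[name] for name in chunk} for chunk in chunks]


# Toy alignment of five sequences with five columns each (made-up data).
toy = {
    "s1": "MKV-L",
    "s2": "MKVAL",
    "s3": "MRV-L",
    "s4": "MKIAL",
    "s5": "MRIAL",
}
parts = split_alignment(toy, max_seqs=2)
print([len(p) for p in parts])  # prints [2, 3]
```

Each sub-alignment could then be passed to a tree-building and matrix-estimation tool on its own, which is where the abstract locates the main time cost.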