Beware of moving targets: reference proteome content fluctuates substantially over the years
Reference proteomes are generated by increasingly sophisticated annotation pipelines as part of regular genome build releases; yet, the corresponding changes in reference proteomes' content are dramatic. In the history of the NCBI-curated human proteome, the total number of entries has remained...
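Main Authors: | Sirota, Fernanda L.; Batagov, Arsen; Schneider, Georg; Eisenhaber, Birgit; Eisenhaber, Frank; Maurer-Stroh, Sebastian |
---|---|
Other Authors: | School of Computer Engineering; School of Biological Sciences |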
Format: | Article |
Language: | English |
Published: | 2013 |
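Subjects: | DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences |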
Online Access: | https://hdl.handle.net/10356/103840 http://hdl.handle.net/10220/17113 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-103840 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-103840 2020-05-28T07:17:19Z Beware of moving targets: reference proteome content fluctuates substantially over the years Sirota, Fernanda L.; Batagov, Arsen; Schneider, Georg; Eisenhaber, Birgit; Eisenhaber, Frank; Maurer-Stroh, Sebastian School of Computer Engineering School of Biological Sciences DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences Reference proteomes are generated by increasingly sophisticated annotation pipelines as part of regular genome build releases; yet, the corresponding changes in reference proteomes' content are dramatic. In the history of the NCBI-curated human proteome, the total number of entries has remained roughly constant but approximately half of the proteins from the 2003 build 33 are no longer represented by entries in current releases, while about the same number of new proteins have been added (for sequence identity thresholds 50–90%). Although mostly hypothetical proteins are affected, there are also spectacular cases of entry removal/addition of well-studied proteins. The changes between the 2003 and recent human proteomes are of a similar order of magnitude to the differences between recent human and chimpanzee proteome releases. As an application example, we show that the proteome fluctuations affect the interpretation (about 74% of hits) of organelle-specific mass-spectrometry data. Although proteome quality tends to improve with more recent releases (for example, the fraction of proteins with functional annotation has increased over time), existing evidence implies that the proteome content still remains incomplete, not just with respect to isoforms/sequence variants but also to clearly distinct proteins and their families. 2013-10-31T03:25:46Z 2019-12-06T21:21:25Z 2013-10-31T03:25:46Z 2019-12-06T21:21:25Z 2012 2012 Journal Article Sirota, F. L., Batagov, A., Schneider, G., Eisenhaber, B., Eisenhaber, F., & Maurer-Stroh, S. (2012). Beware of moving targets: reference proteome content fluctuates substantially over the years. Journal of bioinformatics and computational biology, 10(06), 1250020-. https://hdl.handle.net/10356/103840 http://hdl.handle.net/10220/17113 10.1142/S0219720012500205 en Journal of bioinformatics and computational biology |
institution |
Nanyang Technological University |
building |
NTU Library |
country |
Singapore |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences |
description |
Reference proteomes are generated by increasingly sophisticated annotation pipelines as part of regular genome build releases; yet, the corresponding changes in reference proteomes' content are dramatic. In the history of the NCBI-curated human proteome, the total number of entries has remained roughly constant but approximately half of the proteins from the 2003 build 33 are no longer represented by entries in current releases, while about the same number of new proteins have been added (for sequence identity thresholds 50–90%). Although mostly hypothetical proteins are affected, there are also spectacular cases of entry removal/addition of well-studied proteins. The changes between the 2003 and recent human proteomes are of a similar order of magnitude to the differences between recent human and chimpanzee proteome releases. As an application example, we show that the proteome fluctuations affect the interpretation (about 74% of hits) of organelle-specific mass-spectrometry data. Although proteome quality tends to improve with more recent releases (for example, the fraction of proteins with functional annotation has increased over time), existing evidence implies that the proteome content still remains incomplete, not just with respect to isoforms/sequence variants but also to clearly distinct proteins and their families. |
author2 |
School of Computer Engineering |
format |
Article |
author |
Sirota, Fernanda L.; Batagov, Arsen; Schneider, Georg; Eisenhaber, Birgit; Eisenhaber, Frank; Maurer-Stroh, Sebastian |
author_sort |
Sirota, Fernanda L. |
title |
Beware of moving targets: reference proteome content fluctuates substantially over the years |
publishDate |
2013 |
url |
https://hdl.handle.net/10356/103840 http://hdl.handle.net/10220/17113 |