Beware of moving targets: reference proteome content fluctuates substantially over the years

Reference proteomes are generated by increasingly sophisticated annotation pipelines as part of regular genome build releases; yet, the corresponding changes in reference proteomes' content are dramatic. In the history of the NCBI-curated human proteome, the total number of entries has remained...

Bibliographic Details
Main Authors: Sirota, Fernanda L., Batagov, Arsen, Schneider, Georg, Eisenhaber, Birgit, Eisenhaber, Frank, Maurer-Stroh, Sebastian
Other Authors: School of Computer Engineering
Format: Article
Language: English
Published: 2013
Online Access: https://hdl.handle.net/10356/103840
http://hdl.handle.net/10220/17113
Institution: Nanyang Technological University
id sg-ntu-dr.10356-103840
record_format dspace
spelling sg-ntu-dr.10356-103840 2020-05-28T07:17:19Z Beware of moving targets: reference proteome content fluctuates substantially over the years Sirota, Fernanda L. Batagov, Arsen Schneider, Georg Eisenhaber, Birgit Eisenhaber, Frank Maurer-Stroh, Sebastian School of Computer Engineering School of Biological Sciences DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences Reference proteomes are generated by increasingly sophisticated annotation pipelines as part of regular genome build releases; yet, the corresponding changes in reference proteomes' content are dramatic. In the history of the NCBI-curated human proteome, the total number of entries has remained roughly constant but approximately half of the proteins from the 2003 build 33 are no longer represented by entries in current releases, while about the same number of new proteins have been added (for sequence identity thresholds 50–90%). Although mostly hypothetical proteins are affected, there are also spectacular cases of entry removal/addition of well studied proteins. The changes between the 2003 and recent human proteomes are in a similar order of magnitude as the differences between recent human and chimpanzee proteome releases. As an application example, we show that the proteome fluctuations affect the interpretation (about 74% of hits) of organelle-specific mass-spectrometry data. Although proteome quality tends to improve with more recent releases as, for example, the fraction of proteins with functional annotation has increased over time, existing evidence implies that, apparently, the proteome content still remains incomplete, not just pertaining to isoforms/sequence variants but also to proteins and their families that are clearly distinct. 2013-10-31T03:25:46Z 2019-12-06T21:21:25Z 2013-10-31T03:25:46Z 2019-12-06T21:21:25Z 2012 2012 Journal Article Sirota, F. L., Batagov, A., Schneider, G., Eisenhaber, B., Eisenhaber, F., & Maurer-Stroh, S. (2012). Beware of moving targets: reference proteome content fluctuates substantially over the years. Journal of bioinformatics and computational biology, 10(06), 1250020-. https://hdl.handle.net/10356/103840 http://hdl.handle.net/10220/17113 10.1142/S0219720012500205 en Journal of bioinformatics and computational biology
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences
spellingShingle DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences
Sirota, Fernanda L.
Batagov, Arsen
Schneider, Georg
Eisenhaber, Birgit
Eisenhaber, Frank
Maurer-Stroh, Sebastian
Beware of moving targets: reference proteome content fluctuates substantially over the years
description Reference proteomes are generated by increasingly sophisticated annotation pipelines as part of regular genome build releases; yet, the corresponding changes in reference proteomes' content are dramatic. In the history of the NCBI-curated human proteome, the total number of entries has remained roughly constant but approximately half of the proteins from the 2003 build 33 are no longer represented by entries in current releases, while about the same number of new proteins have been added (for sequence identity thresholds 50–90%). Although mostly hypothetical proteins are affected, there are also spectacular cases of entry removal/addition of well studied proteins. The changes between the 2003 and recent human proteomes are in a similar order of magnitude as the differences between recent human and chimpanzee proteome releases. As an application example, we show that the proteome fluctuations affect the interpretation (about 74% of hits) of organelle-specific mass-spectrometry data. Although proteome quality tends to improve with more recent releases as, for example, the fraction of proteins with functional annotation has increased over time, existing evidence implies that, apparently, the proteome content still remains incomplete, not just pertaining to isoforms/sequence variants but also to proteins and their families that are clearly distinct.
author2 School of Computer Engineering
author_facet School of Computer Engineering
Sirota, Fernanda L.
Batagov, Arsen
Schneider, Georg
Eisenhaber, Birgit
Eisenhaber, Frank
Maurer-Stroh, Sebastian
format Article
author Sirota, Fernanda L.
Batagov, Arsen
Schneider, Georg
Eisenhaber, Birgit
Eisenhaber, Frank
Maurer-Stroh, Sebastian
author_sort Sirota, Fernanda L.
title Beware of moving targets: reference proteome content fluctuates substantially over the years
title_short Beware of moving targets: reference proteome content fluctuates substantially over the years
title_full Beware of moving targets: reference proteome content fluctuates substantially over the years
title_fullStr Beware of moving targets: reference proteome content fluctuates substantially over the years
title_full_unstemmed Beware of moving targets: reference proteome content fluctuates substantially over the years
title_sort beware of moving targets: reference proteome content fluctuates substantially over the years
publishDate 2013
url https://hdl.handle.net/10356/103840
http://hdl.handle.net/10220/17113
_version_ 1681058433633091584