Data collection & curation
To collect software for the prediction of peptide activities, we
searched the PubMed and Google Scholar databases using queries such as
“antimicrobial peptide prediction” and “anticancer peptide prediction”.
We focused on the search results that described software or
reviewed/compared available tools for the prediction of a given
activity.
Eligibility criteria:
- the tool must have been published in a peer-reviewed journal by 1st
July 2022.
- we include models that have been superseded by a newer version.
We manually curated all information for the software featured in the
‘Software information’ tab. To do that, we carefully analysed the
publications describing each tool, looking for the following
information:
- peptide activities predicted by a model,
- link to the web server (Web server column),
- link to the model, i.e. a repository with a trained model which can
be used for prediction or an address of the model implemented as
standalone software (Model repository column),
- link to the repository with all the code and data necessary to
train/retrain the model (Training repository column).
Then, we checked whether the links to the web servers provided in the
articles were still working and whether the servers functioned correctly,
i.e. provided understandable output after running a prediction. This
information is indicated in the Web server activity and Web
server functionality columns, respectively.
The availability of web servers was assessed on October 14th, 2022.
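The availability part of this check can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not the procedure used for the curation: the function names `fetch_status` and `is_server_active`, the 10-second timeout and the rule "any 2xx or 3xx status counts as active" are all hypothetical. Whether a server gives understandable output (the Web server functionality column) still requires running a prediction and inspecting the result by hand.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional


def fetch_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code for `url`, or None if no response was obtained."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 404 or 500.
        return error.code
    except (OSError, ValueError):
        # DNS failure, refused connection, timeout, or a malformed URL.
        return None


def is_server_active(
    url: str,
    fetch: Callable[[str], Optional[int]] = fetch_status,
) -> bool:
    """Assumed rule: a server is 'active' if it answers with a 2xx or 3xx status."""
    status = fetch(url)
    return status is not None and 200 <= status < 400
```

The `fetch` argument makes it possible to try the classification rule without network access, for example `is_server_active("https://example.org", fetch=lambda url: 200)`.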
The year of publication and the number of citations were obtained from
CrossRef on October 14th, 2022.
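For readers who want to refresh these numbers, the sketch below shows one way to read the publication year and citation count from the public CrossRef REST API. It assumes the `works/{doi}` route and the `issued` and `is-referenced-by-count` fields of its JSON response; the helper names are hypothetical, and we do not claim that this is how the values on this website were retrieved.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict, Optional, Tuple


def crossref_url(doi: str) -> str:
    """Build the CrossRef REST API address for a single DOI."""
    return "https://api.crossref.org/works/" + urllib.parse.quote(doi, safe="/")


def extract_year_and_citations(
    record: Dict[str, Any],
) -> Tuple[Optional[int], Optional[int]]:
    """Pull (publication year, citation count) out of a CrossRef response."""
    message = record.get("message", {})
    # 'issued' holds date-parts such as [[2022, 12, 1]]; the year comes first.
    date_parts = message.get("issued", {}).get("date-parts", [[None]])
    year = date_parts[0][0] if date_parts and date_parts[0] else None
    citations = message.get("is-referenced-by-count")
    return year, citations


def fetch_record(doi: str, timeout: float = 10.0) -> Dict[str, Any]:
    """Download the CrossRef record for `doi` (requires network access)."""
    with urllib.request.urlopen(crossref_url(doi), timeout=timeout) as response:
        return json.load(response)
```

A call such as `extract_year_and_citations(fetch_record("10.1016/j.csbj.2022.11.043"))` would return the current values, which change over time as new citations are registered.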
We also inspected the available code to determine the reproducibility
standard (adapted from Heil et al. and
indicated in the Reproducibility standard column). We
additionally use the category below bronze for models that
do not fulfill even the criteria for the bronze category.
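The ordering of the categories can be illustrated with a small Python sketch. The four yes/no criteria below are our simplified paraphrase of the bronze, silver and gold levels as we understand them from Heil et al.; they are assumptions made for illustration, and the adapted criteria actually applied during curation may differ. All names in the code are hypothetical.

```python
from enum import IntEnum


class Standard(IntEnum):
    """Reproducibility categories, ordered from weakest to strongest."""
    BELOW_BRONZE = 0
    BRONZE = 1
    SILVER = 2
    GOLD = 3


def classify(
    code_data_model_available: bool,
    dependencies_single_command: bool,
    deterministic: bool,
    analysis_single_command: bool,
) -> Standard:
    """Assign the highest category whose (assumed) criteria are all met."""
    if not code_data_model_available:
        # Assumed bronze requirement: code, data and model can be downloaded.
        return Standard.BELOW_BRONZE
    if not (dependencies_single_command and deterministic):
        # Assumed silver requirements: one-command setup and deterministic runs.
        return Standard.BRONZE
    if not analysis_single_command:
        # Assumed gold requirement: the whole analysis reruns with one command.
        return Standard.SILVER
    return Standard.GOLD
```

Because `Standard` is an `IntEnum`, categories can be compared directly, e.g. `classify(True, False, False, False) >= Standard.BRONZE`.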
About and citation
This website accompanies our publication: The dynamic landscape
of peptide function prediction.
Citation: Oriol Bárcenas, Carlos Pintado-Grima,
Katarzyna Sidorczuk, Felix Teufel, Henrik Nielsen, Salvador Ventura and
Michał Burdukiewicz, The dynamic landscape of peptide function
prediction, Computational and Structural Biotechnology Journal, 10.1016/j.csbj.2022.11.043.
Authors
Oriol Bárcenas
Oriol Bárcenas is an undergraduate bioinformatics researcher at the
Institute of Biotechnology and Biomedicine at the Autonomous University
of Barcelona (UAB). He graduated with a B.Sc. in Biotechnology from UAB
(2022) and has joined an M.Sc. program in Mathematical Modelling and Data
Science. He plans to continue his career by enrolling in the joint
Bioinformatics Ph.D. program
at UAB. His research will focus on the analysis of protein folding and
aggregation data, as well as in silico protein design.
Twitter: https://twitter.com/oriolbarcenas
Michał Burdukiewicz
Michał Burdukiewicz is currently working as a post-doc at the
Institute of Biotechnology and Biomedicine at the Autonomous University
of Barcelona and a research assistant in the Centre for Clinical
Research at the Medical University of Białystok. His research interests
cover machine learning applications in the functional analysis of
peptides and proteins, focusing on amyloids. Moreover, he is
co-developing tools for proteomics, mainly hydrogen-deuterium exchange
monitored by mass spectrometry.
Contact: michalburdukiewicz[at]gmail.com
Twitter: https://twitter.com/burdukiewicz
Website: https://github.com/michbur
Henrik Nielsen
Henrik Nielsen holds a PhD in Biochemistry and is an associate professor
at the Technical University of Denmark. His research uses machine learning
to predict the subcellular localization of proteins. Henrik’s findings
are available through his tools, such as SignalP, TargetP and DeepLoc.
Website: https://www.healthtech.dtu.dk/protein-sorting
Carlos Pintado-Grima
Carlos Pintado-Grima is a PhD student in Bioinformatics at the
Institute of Biotechnology and Biomedicine at the Autonomous University
of Barcelona (UAB). He obtained his degree in Biology and a Bachelor
of Science from UAB and Thompson Rivers University (Kamloops, BC, Canada).
He received his M.Sc. in Bioinformatics from UAB in 2020. His current
research is focused on the development and analysis of bioinformatics
tools to better understand protein aggregation, folding and
misfolding.
Twitter: https://twitter.com/cpintadogrima
Katarzyna Sidorczuk
Katarzyna Sidorczuk received an M.Sc. degree in biotechnology from
the University of Wrocław, Poland, in 2019. She is currently pursuing
a Ph.D. degree in biological sciences at the University of Wrocław.
Her research focuses on bioinformatics and machine learning approaches
for the analysis and prediction of peptide functions, protein targeting
sequences and bacterial adhesins.
Twitter: https://twitter.com/k_sidorczuk
Felix Teufel
Felix Teufel is a PhD student in Machine Learning at the University
of Copenhagen. He obtained his MSc in Biotechnology from ETH Zürich in
2021. His current research interests are understanding peptide function
using structural methods, representation learning in biology and protein
localization prediction.
Website: https://fteufel.github.io/