The peptide prediction ecosystem is exploding 🤯🧬

publications
peptides
A large-scale review of peptide activity prediction tools reveals rapid growth, reproducibility issues and emerging trends in ML-based peptide bioinformatics.
Author: BioGenies Lab

Published: November 24, 2022

Keywords: peptides, machine learning, deep learning, bioinformatics, AMP, reproducibility, prediction tools


📌 Project highlights

  • 🧬 Reviews 140 peptide prediction tools
  • 🤖 Covers AMP, anticancer, antiviral & more
  • 📊 Tracks the rise of deep learning in peptide ML
  • ⚠️ Reveals a major reproducibility crisis
  • 🌐 Provides a curated database of peptide prediction resources

🎉 New paper out! This one tackles a very meta problem:

👉 there are now SO MANY peptide predictors that choosing one has become a challenge in itself 😄

👉 The dynamic landscape of peptide activity prediction


🔗 Explore the peptide prediction landscape

  • 🌐 Tool list: https://biogenies.info/peptide-prediction-list/

👉 one place to browse the chaotic peptide prediction universe 🌌


🎧 Audio summary

Antimicrobial predictors. Anticancer predictors. Antiviral predictors.
Deep learning everywhere. Broken web servers. Missing code πŸ˜…

πŸ‘‰ Here’s a short audio overview 🎧 explaining what is happening in peptide ML:

[Embedded audio player]

👉 Perfect if you want the big picture of peptide prediction tools


🔬 What is this about?

Peptides can do a lot:

  • 🦠 antimicrobial activity
  • 🎯 anticancer effects
  • 🧠 blood–brain barrier penetration
  • 🔥 anti-inflammatory activity
  • 🧬 anti-amyloid interactions

Because of that:

👉 researchers keep building ML predictors for peptide activity.

And the field exploded 🚀


📊 The scale of the problem

This review identified:

  • 140 peptide prediction tools
  • published between 2009 and 2022

Activities included:

  • antimicrobial peptides (AMPs)
  • anticancer peptides
  • antiviral peptides
  • antifungal peptides
  • cell-penetrating peptides
  • blood–brain barrier peptides

…and many more 😅


⚠️ The core problem

Too many tools.

Too many datasets.

Too many inconsistent definitions.


Example:

👉 “AMP” sometimes means:

  • antibacterial only ❌
  • all antimicrobial peptides ✅

depending on the paper


And this creates:

  • benchmark bias
  • incompatible predictors
  • reproducibility nightmares
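
To see how the definition alone can shift a benchmark, here is a minimal Python sketch. Everything in it is made up for illustration and is not taken from the paper: the peptide names, their activity annotations, and the helpers `label_narrow` and `label_broad`. Treating antibacterial, antifungal and antiviral activity as the "antimicrobial" umbrella is also an assumption.

```python
# Toy records: names and activity annotations are invented for illustration.
peptides = {
    "pep_A": {"antibacterial"},
    "pep_B": {"antifungal"},
    "pep_C": {"antibacterial", "antiviral"},
    "pep_D": set(),  # no antimicrobial activity at all
}

# Assumed "broad" umbrella of antimicrobial activities.
ANTIMICROBIAL = {"antibacterial", "antifungal", "antiviral"}


def label_narrow(activities):
    """Narrow definition: AMP means antibacterial only."""
    return "antibacterial" in activities


def label_broad(activities):
    """Broad definition: AMP means any antimicrobial activity."""
    return bool(activities & ANTIMICROBIAL)


narrow = {name: label_narrow(acts) for name, acts in peptides.items()}
broad = {name: label_broad(acts) for name, acts in peptides.items()}

# Peptides that are positive under one definition and negative under the other.
disagreements = sorted(n for n in peptides if narrow[n] != broad[n])
print(disagreements)  # ['pep_B']
```

In this toy set only the hypothetical `pep_B` flips, but a predictor trained under one definition and benchmarked under the other would be penalized for every such peptide.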

🧠 What they analyzed

πŸ“Š Trends in peptide ML

The study explored:

  • predictive activities
  • ML architectures
  • citations
  • reproducibility
  • web server availability

🤖 Deep learning explosion

Before 2018:

👉 basically no deep models.

By 2021:

👉 almost HALF of predictors used deep learning


🧬 Most common activities

Top prediction targets were:

  • anticancer peptides
  • antimicrobial peptides
  • antiviral peptides

πŸ‘‰ AMP and anticancer prediction dominate the field.


βš™οΈ Most common ML models

The kings of peptide ML:

  • 🌲 Random Forests
  • 📈 Support Vector Machines
  • 🤖 Deep Learning architectures

Interestingly:

👉 deep learning is popular…

BUT not always clearly better.


🔍 Key insights

⚠️ Reproducibility crisis 🚨

This was probably the biggest finding.

Among 111 analyzed tools:

  • only 38 met minimum reproducibility requirements
  • only 9 achieved “gold standard” reproducibility


Meaning:

❌ missing code
❌ missing datasets
❌ broken workflows


👉 and yes…

many web servers were dead 😅
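
For a sense of proportion, the counts above can be turned into shares with a few lines of Python. Only the three numbers come from this post; the variable names and the `share` helper are just illustrative.

```python
# Counts quoted in this post.
analyzed = 111  # tools checked for reproducibility
minimum = 38    # met the minimum reproducibility requirements
gold = 9        # reached "gold standard" reproducibility


def share(count, total):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


minimum_pct = share(minimum, analyzed)
gold_pct = share(gold, analyzed)
below_minimum_pct = share(analyzed - minimum, analyzed)

print(f"met minimum requirements: {minimum_pct}%")   # 34.2%
print(f"gold standard: {gold_pct}%")                 # 8.1%
print(f"below the minimum: {below_minimum_pct}%")    # 65.8%
```

So roughly one in three analyzed tools met the minimum requirements, and fewer than one in ten reached the gold standard.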


🌐 Web servers matter more than reproducibility

Surprisingly:

πŸ‘‰ tools with active web servers got more citations

even if they were less reproducible


This creates a weird incentive:

πŸ‘‰ flashy website > robust science


🤖 Deep learning hype is real

Deep models became more cited over time.

BUT:

the review highlights that for AMP prediction:

👉 DL often does not outperform shallow models


🚀 Why this matters

🧠 For peptide therapeutics

Peptide ML is now central for:

  • antimicrobial discovery
  • anticancer peptides
  • neurodegeneration research

⚠️ For bioinformatics

The field needs:

  • better standards
  • fair benchmarking
  • reproducibility
  • maintenance of tools

🌍 For users

This review acts as:

πŸ‘‰ a navigation map for peptide prediction tools.


💚 BioGenies perspective

This paper is basically:

👉 “someone had to say it” 😄

The peptide ML ecosystem is:

  • powerful
  • exciting
  • rapidly growing

BUT ALSO:

  • fragmented
  • biased
  • difficult to reproduce

And honestly:

👉 this paper connects almost ALL our work together:

  • AmpGram
  • CancerGram
  • AMP benchmarking
  • peptide therapeutics
  • reproducibility standards

 

© 2026 Website developed by BioGenies team.