Amyloid databases: mapping the aggregation universe 🧬📊

publications
amyloids
A comprehensive review of amyloid-related databases, highlighting how experimental data and bioinformatics resources drive advances in protein aggregation research.
Author

BioGenies Lab

Published

October 30, 2024

Keywords

amyloids, protein aggregation, databases, bioinformatics, amyloid prediction, protein misfolding, aggregation


📌 Project highlights

  • 🧬 Comprehensive overview of amyloid & aggregation databases
  • 📊 Covers sequence, structural and interaction resources
  • 🔗 Highlights connections between databases and prediction tools
  • ⚠️ Identifies key limitations in current resources
  • 🚀 Provides curated list of databases: link


🎉 New review out! This one is less about a single tool and more about the entire ecosystem of amyloid data 😄

🔗 Explore the resources

  • 🌐 Full database list
  • 📚 Paper (open access)

👉 This is basically a map of the amyloid bioinformatics landscape.


🎧 Audio summary

Too many databases to remember? Same 😄

👉 We've added a short audio overview 🎧 so you don't have to memorize all of them.


🔬 What is this about?

Amyloid aggregation is a complex, multi-factorial process involving:

  • sequence features
  • 3D structure
  • environmental conditions

👉 and it underlies:

  • neurodegenerative diseases
  • biotechnological challenges
  • functional biological processes

Because of this complexity:

👉 researchers have built many specialized databases to organize experimental knowledge


🧠 What we reviewed

We systematically analyzed amyloid-related databases, grouping them into:

🧬 Sequence-based databases

  • focus on aggregation-prone regions (APRs)
  • examples: AmyLoad, AmyPro

🧊 Structure-based databases

  • store 3D fibril structures
  • example: Amyloid Atlas

🔗 Interaction databases

  • capture cross-interactions between amyloids
  • example: AmyloGraph

👉 Each database captures different aspects of aggregation.


📊 Key insight: fragmentation problem

There is no single “perfect” database.

Instead:

  • each resource focuses on a specific niche
  • data formats and annotations differ
  • integration is difficult

👉 Result:

❌ no unified benchmark dataset
❌ hard to compare prediction tools
❌ fragmented knowledge


⚙️ Databases ↔ prediction tools (the feedback loop)

One of the most important conclusions:

👉 databases and prediction tools co-evolve

  • experimental datasets → enable model development
  • prediction tools → generate new hypotheses
  • new experiments → expand databases

👉 A continuous feedback loop driving the field forward.


🧬 Examples of this interplay

  • AmyloGraph → enabled PACT / AmyloComp (cross-interactions)
  • AmyloBase → contributed to AGGRESCAN
  • Waltz datasets → led to WALTZ algorithm

👉 Data → model → better data → better model


⚠️ Key limitations (important!)

Across databases:

  • 🔍 limited search & filtering
  • 📤 poor export options
  • 🧾 incomplete metadata
  • 🤖 reliance on predictions (with biases)

👉 And most importantly: aggregation is not only sequence-dependent

Environmental factors matter:

  • pH
  • temperature
  • concentration
  • cofactors
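
One way to picture this limitation: if records are keyed by sequence alone, two experiments on the same peptide under different conditions collapse into a single entry. The sketch below keys observations by sequence plus conditions instead. The `Conditions` and `Observation` types, their field names and all values are hypothetical placeholders, not a schema used by any existing database and not experimental results.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Conditions:
    """Hypothetical experimental conditions; None means 'not reported'."""
    ph: Optional[float] = None
    temperature_c: Optional[float] = None
    concentration_um: Optional[float] = None
    cofactors: tuple = ()


@dataclass(frozen=True)
class Observation:
    sequence: str
    conditions: Conditions
    aggregated: bool


# Placeholder values for illustration only.
observations = [
    Observation("ACDEFG", Conditions(ph=7.4, temperature_c=37.0), True),
    Observation("ACDEFG", Conditions(ph=2.0, temperature_c=37.0), False),
]

# Keyed by sequence only: the second observation overwrites the first.
by_sequence = {o.sequence: o.aggregated for o in observations}

# Keyed by (sequence, conditions): both observations survive.
by_context = {(o.sequence, o.conditions): o.aggregated for o in observations}


def missing_fields(c):
    """Return the names of numeric condition fields that were not reported."""
    names = ["ph", "temperature_c", "concentration_um"]
    return [n for n in names if getattr(c, n) is None]
```

Here `by_sequence` ends up with one entry and `by_context` with two, and `missing_fields` gives a simple way to flag incomplete metadata.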

🚀 Why this matters

This review shows:

👉 we have a lot of data
👉 but not yet fully integrated knowledge

Future directions:

  • better standardization (e.g. MIRRAGGE)
  • integration of datasets
  • ML models using multi-dimensional data
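
As a rough illustration of what "multi-dimensional" input could look like, the sketch below concatenates a sequence descriptor with two environmental descriptors into one feature vector. It is an assumption-laden toy, not a method from the review: the `featurize` function, the choice of amino-acid composition as the sequence descriptor and the rescaling ranges are all hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def composition(sequence):
    """Fraction of each standard amino acid in the sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def featurize(sequence, ph, temperature_c):
    """Concatenate sequence composition with two environmental descriptors."""
    # Crude rescaling to roughly [0, 1]; the assumed ranges are
    # pH 0 to 14 and 0 to 100 degrees Celsius.
    return composition(sequence) + [ph / 14.0, temperature_c / 100.0]


# Placeholder sequence and conditions, for illustration only.
features = featurize("ACDEFG", ph=7.0, temperature_c=37.0)
```

The resulting vector has 22 entries: 20 for composition and 2 for conditions. A real model would likely need richer descriptors, such as structural features, concentration and cofactors, which is where standardized reporting becomes important.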
 
