Amyloid Proteins and Peptide Dataset
Information CPAD CPAD 2.0
Dataset size
  CPAD: Redundant data points: Amyloid hexapeptides (GAP): 179, Amyloid hexapeptides (Waltz-DB): 244, Amyloid peptides (different lengths): 332, Amorphous peptides (GAP): 168, Non-amyloid peptides (Waltz-DB): 845
  CPAD 2.0: Total data points: 2031 (from GAP, AmyLoad, Waltz-DB 2.0)
    Amyloid-forming peptides (all lengths): 917
    Non-amyloid peptides: 1114
Related information
  CPAD: None
  CPAD 2.0: Information on the source protein, with literature information (if available)
Derived properties
  CPAD: None
  CPAD 2.0: Calculated aggregation-related properties for the peptides, such as net charge, absolute charge, hydrophobic residue percentage, gatekeepers and position of the peptide in the source protein, aggregation propensity from various servers, and the orientation of fibrils (from the PASTA server)
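
As a rough illustration of the simpler derived properties listed above (net charge, absolute charge, hydrophobic residue percentage), the following Python sketch computes them from a peptide sequence. The residue classes and the definition of absolute charge used here are assumptions made for illustration only; the definitions actually used in CPAD 2.0 are not stated in this table and may differ. The function name `peptide_properties` is hypothetical.

```python
# Assumed residue classes (CPAD 2.0 may use different definitions).
POSITIVE = set("KR")            # Lys, Arg counted as +1 each
NEGATIVE = set("DE")            # Asp, Glu counted as -1 each
HYDROPHOBIC = set("AVILMFWC")   # one common convention, not necessarily CPAD's


def peptide_properties(sequence):
    """Return simple charge and hydrophobicity descriptors for a peptide."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence must not be empty")

    n_pos = sum(res in POSITIVE for res in seq)
    n_neg = sum(res in NEGATIVE for res in seq)
    n_hyd = sum(res in HYDROPHOBIC for res in seq)

    net = n_pos - n_neg
    return {
        "net_charge": net,
        # Assumption: absolute charge is taken as |net charge|.
        "absolute_charge": abs(net),
        "hydrophobic_percent": 100.0 * n_hyd / len(seq),
    }


# Example hexapeptide sequence, used only to exercise the function.
props = peptide_properties("KLVFFA")
print(props)
```

Histidine and the peptide termini are ignored in this sketch, so the values are a coarse approximation rather than a pH-dependent charge calculation.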

The data collected to develop the GAP server (by the Protein Bioinformatics Lab) have been merged with the CPAD database in the current version. The source of all entries from GAP is marked as CPAD.
Aggregation-prone regions (APRs) in amyloidogenic proteins
Information CPAD CPAD 2.0
Dataset size
  CPAD: 33 unique proteins
  CPAD 2.0: 268 unique proteins, with information on 912 APRs
APR information
  CPAD: Protein name, Protein sequence, Position of APR, APR sequence, Reference
  CPAD 2.0: APR information (Length, Mutation(s), APR sequence)
    Protein information (Protein Name, Species, UniProt ID, UniProt Name, PDB ID, Category (Pathogenic, Functional), Prion Information, Protein sequence, APR position in the sequence with gatekeeper information)
    Literature information (PubMed link, PMID, Source database)
Aggregation Kinetics database
Information CPAD CPAD 2.0
Database size
  CPAD: 2356 data on the change in aggregation rate upon mutation
  CPAD 2.0: Collectively 83098 data on experimental aggregation rate
    (We have merged the intensity (time kinetics) values for each experiment for better representation and plotting)
    82066 data points on time kinetics were merged into 1781 records.
    (refer to the time kinetics database or tutorial)
Objective of the database
  CPAD: To collect the experimental information on point mutations
  CPAD 2.0: To collect the experimental information on any kind of aggregation data
    The dataset is divided into 3 parts:
    (1) Aggregation rate: The aggregation rate already calculated in the literature by fitting the curve
    (2) Intensity (Time kinetics): The time-dependent experimental aggregation kinetics data (mostly with a characteristic sigmoidal curve)
    (3) Intensity (Other): The aggregation experiments performed under varying experimental conditions (observations made at either equilibrium or maximum fluorescence)
Other stats
  CPAD: Wild-type data: 586
    Point mutation data: 1658
  CPAD 2.0: Wild-type data: 40555
    Point mutation data: 39608
    Double mutation data: 1293
    Triple mutation data: 169
    Multiple mutation data: 866
    Modified residue mutation data: 613
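
The merging of time-kinetics data points into per-experiment records described above can be pictured with a small Python sketch. This is only an illustration under assumed conventions: the row layout `(experiment_id, time, intensity)`, the record field names, and the function name `merge_time_points` are hypothetical, and the sample values are made up rather than taken from CPAD 2.0.

```python
from collections import defaultdict


def merge_time_points(rows):
    """Collapse (experiment_id, time, intensity) rows into one record per experiment."""
    grouped = defaultdict(list)
    for experiment_id, time, intensity in rows:
        grouped[experiment_id].append((time, intensity))

    records = []
    for experiment_id, points in grouped.items():
        points.sort()  # order each curve by time so it can be plotted directly
        records.append({
            "experiment_id": experiment_id,
            "time": [t for t, _ in points],
            "intensity": [i for _, i in points],
        })
    return records


# Made-up rows: five individual data points from two experiments.
rows = [
    ("exp1", 0.0, 0.02), ("exp1", 2.0, 0.50), ("exp1", 1.0, 0.10),
    ("exp2", 0.0, 0.01), ("exp2", 1.0, 0.03),
]
records = merge_time_points(rows)
print(len(rows), "data points merged into", len(records), "records")
```

Each resulting record holds a whole curve, which is convenient for plotting a single experiment at a time.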
Structure of Aggregating Proteins
Information CPAD CPAD 2.0
Dataset size
  CPAD: Does not contain structure information
    (However, 23 amyloid peptides with known structures were mentioned in CPAD)
  CPAD 2.0: Contains 565 data with PDB information in the following sub-categories:
    (1) Peptides: 42 structures
    (2) Protein: 159 structures
    (3) Fibrils: 215 structures
    (4) Aggregating complex (with ligand): 14 structures
    (5) Inhibitor complex (with ligand): 130 structures
    (6) Fibril complex: 2 structures
    (7) Protein complex: 3 structures
    There is also a separate categorization: amyloid and non-amyloid structures
Structure information
  CPAD: No information available other than PDB ID
  CPAD 2.0: Protein information: (Protein Name, Species, UniProt ID, PDB ID, Length, Mutation(s), Protein sequence)
    Structure information: (PDB Structure, Secondary Structure, Experimental method to determine the structure, Resolution, R-value Free, PDB classification, Global Stoichiometry, Ligand information, Contact map)
    Literature information: PubMed link, Author, PMID