Peptide Database

Platform Overview

Pep2Net-DB — a curated peptide evidence database integrating sequences, assays, structures, and ML predictions.

Pep2Net-DB is built on a PostgreSQL database storing thousands of peptide entries sourced from public databases (DRAMP, APD, etc.) and literature. Each entry links a peptide sequence with its biological source, functional annotations, experimental assay evidence, computed physicochemical properties, predicted 3D structures, and machine-learning-based function predictions.

The platform uses a single ESM-2 + GATv2 deep learning architecture for function prediction, shared across 40+ biological activity models.

Data Highlights

What makes this database distinctive.

Integrated Sequence + Assay Evidence — Each peptide entry links directly to its experimental assay records (MIC, IC50, EC50, LD50) with target organisms, measurement units, PubMed references, and source provenance.
Predicted 3D Structures — ESMFold-generated PDB files available for structure-ready peptides, visualized interactively with 3Dmol.js in the browser.
Structure Similarity Search — Find peptides with similar 3D conformations by uploading a PDB file or referencing an existing structure. Uses PDB coordinate parsing with a precomputed signature cache.
Sequence Similarity Search — k-mer prefiltering + global alignment scoring for finding similar peptides, with adjustable identity and length tolerance thresholds.
Curated Function Ontology — Hierarchical function classification tree built from the database content, from broad Antimicrobial/Anticancer categories down to specific organism-level subtypes.
ESM-2 + GATv2 Multi-label Predictions — Toxicity bundle predicts 6 labels simultaneously (Toxin, Hemolytic, Cytotoxic, Cytolytic, Mammalian Toxicity, Celiac Toxicity). Antiviral bundle predicts 7 labels. Subtype models share the same backbone.
Streaming Data Export — Download presets for evidence-backed subsets (non-hemolytic candidates, MIC/MBC potency, antimicrobial collections) in CSV, FASTA, or JSON with streaming for large datasets.
Unique Sequence Deduplication — Browse and analyze by unique peptide sequence, not just by entry ID, to avoid inflated statistics from duplicate records.
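
To illustrate why counting by unique sequence matters, here is a minimal sketch that groups entries by sequence. The entry IDs, the (id, sequence) tuple layout, and the normalization rule (uppercase, whitespace removed) are assumptions for illustration, not the database's actual schema.

```python
from collections import defaultdict

def group_by_sequence(entries):
    """Group entry IDs under a normalized sequence key."""
    groups = defaultdict(list)
    for entry_id, sequence in entries:
        # Assumed normalization: drop whitespace and ignore letter case.
        key = "".join(sequence.split()).upper()
        groups[key].append(entry_id)
    return dict(groups)

# Hypothetical records: E1 and E2 carry the same peptide sequence.
entries = [
    ("E1", "GIGKFLHSAKKFGKAFVGEIMNS"),
    ("E2", "gigkflhsakkfgkafvgeimns"),
    ("E3", "KWKLFKKIEKVGQNIRDGIIKAGPAVAVVGQATQIAK"),
]
unique = group_by_sequence(entries)
entry_count = len(entries)      # counts duplicate records twice
unique_count = len(unique)      # counts each sequence once
```

Here three entries collapse to two unique sequences, which is the difference between entry-level and sequence-level statistics.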

Key Features

Core capabilities of the Pep2Net-DB platform.

Curated Database

Access standardized peptide data with comprehensive annotations including sequence, function, source, and cross-references.

Smart Browsing

Filter peptides by type, source organism, modifications, and biological function with intuitive interfaces.

ML-Based Predictions

Single ESM-2 + GATv2 architecture for 40+ prediction tasks — Toxicity, Antiviral, Antibacterial, and more.

Flexible Export

Download datasets in CSV, FASTA, or JSON format with complete annotation information.

Interactive Statistics

ECharts-powered interactive charts showing database composition across dimensions.

Browse Database

Explore peptides through interactive filters and multiple classification dimensions.

The Browse page lets you navigate the entire peptide database through multiple classification dimensions. Clicking any entry opens a detailed single-peptide view.

Filter Dimensions

Peptide Type

Linear: Straight-chain peptides
Cyclic: Head-to-tail cyclized
Branched: Multi-chain structures
Other: Special topologies

Source Organism

Animal: Mammalian, amphibian, insect
Plant: Botanical sources
Bacterium: Microbial-derived
Synthetic: Lab-designed

Modifications

N-terminal: Acetylation, formylation
C-terminal: Amidation, carboxylation
Chemical: Side chain modifications

Biological Function

Antimicrobial: Antibacterial, antifungal
Therapeutic: Anticancer, anti-aging
Functional: Signaling, cell-penetrating

Single Peptide View

Clicking any entry opens a comprehensive detail page featuring the full amino acid sequence, all computed physicochemical properties (molecular weight, pI, net charge, GRAVY, hydrophobic moment, instability index, Boman index), functional annotations, and linked assay evidence records.

Interactive 3D Structure Viewer: predicted PDB structures are rendered directly in the browser by 3Dmol.js, with zoom, rotation, and style toggles. Previous / Next navigation moves between peptide entries from the detail page.

Function Ontology Browser

A hierarchical function ontology tree built from the database lets you navigate from broad categories (e.g., Antimicrobial) down to specific subtypes (e.g., Anti-Gram-positive, Anti-MRSA), with counts showing how many peptides belong to each node. Filter controls allow narrowing by evidence status, sequence length, and standard/non-standard sequences.
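
One way to picture a tree with per-node peptide counts is the sketch below. It assumes each annotation is a path of labels from broad to specific; the peptide IDs, the exact nesting of Anti-MRSA under Anti-Gram-positive, and the dictionary layout are illustrative assumptions rather than the platform's data model.

```python
def build_function_tree(annotations):
    """Build a nested tree; every node on a path records the peptide ID."""
    tree = {}
    for peptide_id, path in annotations:
        children = tree
        for label in path:
            node = children.setdefault(label, {"peptides": set(), "children": {}})
            node["peptides"].add(peptide_id)   # a set avoids double counting
            children = node["children"]
    return tree

def node_count(tree, path):
    """Number of distinct peptides at the node reached by `path`."""
    children, node = tree, None
    for label in path:
        node = children[label]
        children = node["children"]
    return len(node["peptides"])

# Hypothetical annotations (peptide ID, ontology path).
annotations = [
    ("P1", ("Antimicrobial", "Anti-Gram-positive", "Anti-MRSA")),
    ("P2", ("Antimicrobial", "Anti-Gram-positive")),
    ("P1", ("Antimicrobial", "Antifungal")),
]
tree = build_function_tree(annotations)
```

Because each node stores a set of peptide IDs, a peptide annotated under two branches of the same parent is counted once at the parent.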

Other Browse Views

Unique Sequence Browse — Deduplicated view grouping all entries by unique peptide sequence.
Assay Record Browse — Paginated raw experimental evidence table with target organisms, assay methods (MIC, IC50, EC50, LD50), and activity values.

Search

Find peptides by sequence, name, cross-reference, or structural features.

The Search page offers two search modes with downloadable results (CSV export, up to 50,000 records):

Sequence Search

Search by peptide name (e.g., LL-37), amino acid sequence, or database cross-reference (DRAMP ID, APD ID, PubMed ID). Apply additional filters for sequence length, molecular weight range, isoelectric point, source organism, and function category.

Sequence Similarity Search — paste a query sequence to find similar peptides using k-mer prefiltering followed by global alignment scoring. Adjustable parameters include minimum identity, length tolerance, maximum candidates to scan, and result count.
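
The two-stage idea (cheap k-mer prefilter, then global alignment on the survivors) can be sketched as follows. This is a simplified illustration, not the platform's implementation: the scoring scheme (+1 match, -1 mismatch, -1 gap), the identity definition (matches divided by alignment length), and all function and parameter names are assumptions.

```python
def kmers(seq, k=3):
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def global_identity(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch alignment; returns matches / alignment length."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    # Trace back from the corner, counting aligned columns and matches.
    i, j, matches, length = n, m, 0, 0
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            matches += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        length += 1
    return matches / length if length else 0.0

def similar(query, database, k=3, min_identity=0.5, length_tolerance=0.3,
            max_candidates=100, top_n=10):
    q_kmers = kmers(query, k)
    candidates = []
    for pid, seq in database.items():
        if abs(len(seq) - len(query)) > length_tolerance * len(query):
            continue                      # outside the length tolerance
        shared = len(q_kmers & kmers(seq, k))
        if shared:                        # prefilter: at least one shared k-mer
            candidates.append((shared, pid, seq))
    candidates.sort(reverse=True)         # most shared k-mers first
    hits = []
    for _, pid, seq in candidates[:max_candidates]:
        identity = global_identity(query, seq)
        if identity >= min_identity:
            hits.append((pid, identity))
    return sorted(hits, key=lambda h: h[1], reverse=True)[:top_n]

# Hypothetical mini-database keyed by made-up IDs.
database = {
    "A": "GIGKFLHSAKKFGKAFVGEIMNS",
    "B": "GIGKFLHSAKKFGKAFVGEIMKS",
    "C": "ACDEFGHIKLMNPQRSTVWY",
}
hits = similar("GIGKFLHSAKKFGKAFVGEIMNS", database)
```

The prefilter discards sequence C without aligning it, since it shares no 3-mer with the query; only A and B reach the quadratic-time alignment step.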

Structure Search

Search by 3D structural similarity in two ways: select an existing peptide from the database (using its predicted PDB structure), or upload your own PDB file. The search generates a structure signature from PDB coordinates and compares against a precomputed cache for fast similarity ranking, with configurable score threshold and bounded candidate scanning.
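
The source does not specify how the structure signature is computed. As one plausible illustration, the sketch below parses C-alpha coordinates from fixed-width PDB ATOM records, summarizes them as a normalized histogram of pairwise distances, and ranks cached signatures by cosine similarity. The signature definition, bin settings, and all names are assumptions.

```python
import math
from itertools import islice

def ca_coordinates(pdb_text):
    """Extract C-alpha coordinates using the fixed PDB ATOM columns."""
    coords = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    return coords

def distance_signature(coords, bin_width=2.0, n_bins=15):
    """Normalized histogram of pairwise CA distances (assumed signature)."""
    hist = [0.0] * n_bins
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            hist[min(int(d / bin_width), n_bins - 1)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank(query_sig, cache, min_score=0.0, max_candidates=1000):
    """Score at most `max_candidates` cached signatures, best first."""
    scored = []
    for pid, sig in islice(cache.items(), max_candidates):
        s = cosine(query_sig, sig)
        if s >= min_score:
            scored.append((pid, s))
    return sorted(scored, key=lambda item: item[1], reverse=True)

def demo_pdb(points):
    # Builds fixed-width ATOM records for CA atoms; for demonstration only.
    return "\n".join(
        f"ATOM  {i:5d}  CA  ALA A{i:4d}    {x:8.3f}{y:8.3f}{z:8.3f}"
        for i, (x, y, z) in enumerate(points, start=1)
    )

extended = demo_pdb([(3.8 * i, 0.0, 0.0) for i in range(6)])
compact = demo_pdb([(0, 0, 0), (3.8, 0, 0), (3.8, 3.8, 0),
                    (0, 3.8, 0), (0, 3.8, 3.8), (0, 0, 3.8)])
cache = {name: distance_signature(ca_coordinates(text))
         for name, text in (("extended", extended), ("compact", compact))}
ranked = rank(cache["extended"], cache)
```

Precomputing `cache` once means a query only costs one signature computation plus a linear scan, which is the benefit the precomputed cache described above is meant to provide.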

Prediction Tools

Compute physicochemical properties and run ML-based function predictions.

The Tools page provides two categories: Physicochemical Property Calculations (deterministic) and Machine Learning Function Predictions (ESM-2 + GATv2).

How to Use

1. Input Sequences: Paste one or more peptide sequences (one per line) or upload a FASTA/CSV file.
2. Select Models: Choose from available physicochemical calculators and ML function classifiers.
3. Run Analysis: Click Submit to execute all selected predictions. ML results include confidence scores.
4. Export Results: Download prediction results as a CSV file for offline analysis.
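
For readers preparing input for the first step, the following sketch shows how pasted text in either accepted form (one sequence per line, or FASTA) might be normalized before analysis. The function names, the automatic `seqN` naming, and the check against the 20 standard amino acids are assumptions, not the Tools page's actual parser.

```python
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")

def parse_sequences(text):
    """Return (name, sequence) pairs from FASTA or one-per-line input."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    if not lines[0].startswith(">"):
        # Plain input: every non-empty line is one sequence.
        return [(f"seq{i}", ln.upper()) for i, ln in enumerate(lines, start=1)]
    records = []
    for ln in lines:
        if ln.startswith(">"):
            records.append([ln[1:].strip(), ""])   # header starts a new record
        else:
            records[-1][1] += ln.upper()            # sequence may span lines
    return [(name, seq) for name, seq in records]

def nonstandard_residues(seq):
    """Letters outside the 20 standard amino acids, sorted."""
    return sorted(set(seq) - STANDARD)

fasta_records = parse_sequences(">p1 demo\nGIGK\nFLHS\n>p2\nkwkl\n")
plain_records = parse_sequences("GIGK\n\nkwkl")
```

Flagging nonstandard residues up front is useful because models trained on standard sequences may not handle them.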

Physicochemical Properties (Deterministic)

These properties are computed without machine learning, so results are fast and deterministic.
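
As an example of a deterministic property, GRAVY (grand average of hydropathy) is the mean Kyte-Doolittle hydropathy value over all residues. The sketch below is a generic illustration of that definition; the function name and the choice to reject nonstandard letters are assumptions, and the platform's own calculator may treat edge cases differently.

```python
# Kyte-Doolittle hydropathy scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def gravy(sequence):
    """Mean hydropathy; raises ValueError on empty or nonstandard input."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = set(seq) - KYTE_DOOLITTLE.keys()
    if unknown:
        raise ValueError(f"nonstandard residues: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)

value = gravy("GIGK")
```

The same input always yields the same value, which is what distinguishes these calculators from the ML classifiers.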

ESM-2 + GATv2 Architecture

Unified deep learning architecture for all function prediction models.

All ML-based function prediction models share a unified architecture:

ESM-2 Backbone: A 650M-parameter protein language model (Meta AI) generates per-residue sequence embeddings that capture evolutionary and structural information.
GATv2 Layers: Graph Attention Network v2 layers, which use a dynamic attention mechanism, process the sequence as a graph and learn residue-residue interaction patterns.
Bundle Models: Top-level models (toxicity, antiviral) perform multi-label classification (6 and 7 labels respectively). Selecting individual subtypes automatically promotes the parent bundle for efficient inference.
Single-Label Models: Subtype models for antibacterial, therapeutic, antifungal, and other categories use specialized prediction heads sharing the same backbone.
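
The bundle promotion described above can be pictured as a small planning step that maps each selected label to its parent bundle and schedules that bundle once. The sketch below is a hypothetical illustration: only the six toxicity labels are taken from this page (the seven antiviral labels are not listed here, so they are omitted), and the function and variable names are assumptions.

```python
# Toxicity labels as listed on this page; other bundles omitted.
BUNDLES = {
    "toxicity": ["Toxin", "Hemolytic", "Cytotoxic", "Cytolytic",
                 "Mammalian Toxicity", "Celiac Toxicity"],
}
LABEL_TO_BUNDLE = {label: bundle
                   for bundle, labels in BUNDLES.items()
                   for label in labels}

def plan_inference(selected):
    """Return (bundles, single_models), each listed once in selection order."""
    bundles, singles = [], []
    for name in selected:
        # A subtype label is promoted to its parent bundle.
        target = LABEL_TO_BUNDLE.get(name, name if name in BUNDLES else None)
        if target is not None:
            if target not in bundles:
                bundles.append(target)
        elif name not in singles:
            singles.append(name)
    return bundles, singles

plan = plan_inference(["Hemolytic", "Cytotoxic", "Antifungal"])
```

Selecting two toxicity subtypes results in one toxicity bundle run rather than two separate model runs, while the unrelated subtype stays a single-label job.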

Prediction Workflow

Sequence embedding generation using ESM-2
Graph construction and feature propagation
Attention-based feature aggregation with GATv2
Task-specific classification and confidence scoring
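
The four workflow steps can be sketched end to end in plain Python. This is a toy illustration under loud assumptions: random vectors stand in for ESM-2 embeddings, the graph connects residues within a fixed sequence window (the page does not say how edges are defined), weights are random rather than trained, and a single simplified GATv2-style attention head replaces the real layers.

```python
import math
import random

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def leaky_relu(v, slope=0.2):
    return [x if x > 0 else slope * x for x in v]

def sequence_graph(n, window=2):
    # Assumption: residue i neighbors residues within `window` positions (itself included).
    return {i: list(range(max(0, i - window), min(n, i + window + 1))) for i in range(n)}

def gatv2_layer(H, neighbors, W_src, W_dst, a):
    """GATv2-style scoring: a . LeakyReLU(W_src h_j + W_dst h_i), softmax over j."""
    out = []
    for i, h_i in enumerate(H):
        dst = matvec(W_dst, h_i)
        msgs, scores = [], []
        for j in neighbors[i]:
            src = matvec(W_src, H[j])
            hidden = leaky_relu([s + d for s, d in zip(src, dst)])
            scores.append(sum(ak * hk for ak, hk in zip(a, hidden)))
            msgs.append(src)
        top = max(scores)                        # subtract max for a stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([sum(w / total * m[k] for w, m in zip(weights, msgs))
                    for k in range(len(a))])
    return out

def predict(H, W_src, W_dst, a, W_out, window=2):
    H2 = gatv2_layer(H, sequence_graph(len(H), window), W_src, W_dst, a)
    pooled = [sum(col) / len(H2) for col in zip(*H2)]      # mean over residues
    return [1 / (1 + math.exp(-z)) for z in matvec(W_out, pooled)]  # per-label sigmoid

rng = random.Random(0)

def rand_matrix(rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

embed_dim, hidden_dim, n_labels, n_residues = 8, 4, 6, 10
H = rand_matrix(n_residues, embed_dim)   # stand-in for per-residue ESM-2 embeddings
probs = predict(H, rand_matrix(hidden_dim, embed_dim), rand_matrix(hidden_dim, embed_dim),
                rand_matrix(1, hidden_dim)[0], rand_matrix(n_labels, hidden_dim))
```

Each of the six outputs is an independent probability, which matches the multi-label setting where one peptide can carry several labels at once.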

Figure: ESM-2 + GATv2 unified architecture for peptide property prediction.

Available Predictions

40+ trained models using the ESM-2 + GATv2 unified architecture.

Antimicrobial Activity

Therapeutic Activity

Functional Peptides

Plant & Defense

Toxicity

Data Download

Export curated peptide datasets with complete annotations.

The Download page provides curated preset data packages for offline analysis in CSV, FASTA, and JSON formats, with streaming export for large datasets.
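
Streaming export generally means producing the file in pieces so the server never holds the full dataset in memory. The sketch below shows the idea for CSV with a Python generator; the field names, chunk size, and row source are assumptions, and the platform's actual export code is not shown on this page.

```python
import csv
import io

def stream_csv(rows, fieldnames, chunk_rows=1000):
    """Yield CSV text in chunks of at most `chunk_rows` data rows."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    for count, row in enumerate(rows, start=1):
        writer.writerow(row)
        if count % chunk_rows == 0:
            yield buffer.getvalue()
            buffer.seek(0)
            buffer.truncate(0)        # reuse the buffer for the next chunk
    if buffer.tell():
        yield buffer.getvalue()       # flush header-only or trailing rows

def example_rows():
    # Hypothetical row source; in practice this would be a database cursor.
    yield {"id": "E1", "sequence": "GIGK"}
    yield {"id": "E2", "sequence": "KWKL"}
    yield {"id": "E3", "sequence": "AAAA"}

chunks = list(stream_csv(example_rows(), ["id", "sequence"], chunk_rows=2))
```

A web framework can pass such a generator directly to a streaming response, so the first bytes reach the client before the last row has been read.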

Available Preset Packages

Evidence-Based Subsets

Verified Peptide Index — experimentally verified entries
Activity-Supported Peptides — entries with quantitative evidence
MIC/MBC Potency — peptides with potency data
PMID-Supported Evidence — literature-linked records

Function-Focused Collections

Antimicrobial Sequences — antibacterial, antifungal set
Antibacterial / Antiviral / Anticancer — activity-specific exports
Non-Hemolytic Candidates — therapeutic development subset
Structure-Ready Index — peptides with predicted 3D structures

Source-Specific Packages

Animal / Synthetic / Bacterium
Plant / Virus
Verified Assay Evidence

Statistics

Interactive ECharts visualizations summarizing database contents.

The Statistics page presents interactive charts built with ECharts, backed by a precomputed JSON cache for fast loading:

Peptide Type Distribution — pie/bar chart across linear, cyclic, branched, and other topologies.
Source Organism Breakdown — distribution across animals, plants, bacteria, viruses, fungi, and synthetic sources.
Function Category Overview — top functions, sources, and assay methods ranked by count.
Sequence & Property Distributions — histograms of sequence length, molecular weight, pI, net charge, GRAVY, hydrophobic moment, instability index, and Boman index across the entire database.
Evidence Coverage — verified vs unverified peptides, PMID-linked records, unique vs duplicate sequences, and standard vs nonstandard amino acid composition.
Top Studied Peptides — peptides with the most assay records and activity evidence.

All charts support hover details, zoom, and series toggling.