The Pocketome [Ref. 1] is an encyclopedia of conformational ensembles of druggable binding sites that can be identified experimentally from co-crystal structures in the Protein Data Bank. Each Pocketome entry describes a site on a protein surface that is involved in transient interactions with small molecules and peptides. The automatic Pocketome generation procedure includes only proteins that (i) have an entry in the reviewed part of the UniProt Knowledgebase, (ii) have been co-crystallized in complex with at least one drug-like small molecule, and (iii) are represented by at least two PDB entries. As a result of manual curation, the Pocketome also contains some proteins and binding sites that do not satisfy one or more of the above requirements.
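The three automatic inclusion criteria above can be read as a simple filter. The following is a minimal sketch only: the `ProteinRecord` class, its field names, and the example identifiers are hypothetical and are not part of the Pocketome pipeline or data format.

```python
from dataclasses import dataclass


@dataclass
class ProteinRecord:
    """Hypothetical per-protein summary; field names are illustrative only."""
    uniprot_id: str
    reviewed: bool           # (i) present in the reviewed part of UniProtKB
    druglike_complexes: int  # (ii) co-crystal structures with a drug-like small molecule
    pdb_entries: int         # (iii) PDB entries representing the protein


def passes_automatic_filter(p: ProteinRecord) -> bool:
    """Apply criteria (i)-(iii) as stated in the text."""
    return p.reviewed and p.druglike_complexes >= 1 and p.pdb_entries >= 2


# Made-up records, used only to exercise the filter.
candidates = [
    ProteinRecord("EXAMPLE_A", reviewed=True, druglike_complexes=3, pdb_entries=5),
    ProteinRecord("EXAMPLE_B", reviewed=True, druglike_complexes=0, pdb_entries=4),
    ProteinRecord("EXAMPLE_C", reviewed=False, druglike_complexes=2, pdb_entries=2),
    ProteinRecord("EXAMPLE_D", reviewed=True, druglike_complexes=1, pdb_entries=1),
]
selected = [p.uniprot_id for p in candidates if passes_automatic_filter(p)]
```

Manually curated entries would bypass such a filter, which is why some Pocketome entries do not satisfy all three criteria.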
The Pocketome is updated following releases of the PDB and UniProt databases.
The current release is 18.4, last updated 2018-04-26.
It contains 3313 entries in total, of which at least 1502 are from mammals.
[Figure: number of pockets per entry; median pairwise pocket RMSD]
Access and format
The Pocketome entries are served as data packs, each containing binding site annotation, atomic coordinates, contact matrices, and the results of pairwise structure comparison within the ensemble. These data packs are available for online viewing (interactive structure visualization is supported by the IcmJS technology) and for download in a variety of file formats including ICB (ICM Binary Format readable by Molsoft ICM Browser) and XML (Extensible Markup Language).
To speed up the molecular visualization, the Pocketome structures use a reduced atom representation for all parts of the protein except for the binding pocket. Structures of identical composition are eliminated from the online version of the entries, but not from the search index. The structures inside the Pocketome entries may also differ from the original PDB files in the following aspects:
- Spatial orientation: multiple structures of the same binding site are superimposed.
- Residue numbering: amino-acid residues are numbered according to the reference SwissProt sequence with initiating Met and signal peptide present (where applicable).
- Ligand names may differ from PDB HET codes; this is done to remove ambiguities and merge covalently linked parts of a single ligand.
- Protein molecule names and number: biological units are reconstructed in the Pocketome entries for those cases where the oligomeric partner forms an essential part of the binding pocket. Multiple parts of a single protein molecule are merged together and renumbered according to the SwissProt sequence.
- Water molecules are removed.
- Detergents, ions, and cofactors are removed, except where they form an essential part of the binding pocket.
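Because the structures in an entry are superimposed and share one residue numbering, pocket atoms can be compared across structures without further alignment. The sketch below illustrates one way a pairwise comparison could be summarized as a median RMSD. It is an assumption-laden illustration: the source does not state how the Pocketome computes its pairwise comparison, and the function names and the keying of atoms by (residue number, atom name) are hypothetical.

```python
from itertools import combinations
from math import dist, sqrt
from statistics import median

# Hypothetical representation: one pocket maps (residue number, atom name)
# to Cartesian coordinates in the shared, already superimposed frame.
Coords = dict[tuple[int, str], tuple[float, float, float]]


def pocket_rmsd(a: Coords, b: Coords) -> float:
    """RMSD over the atoms present in both pockets, with no refitting."""
    shared = a.keys() & b.keys()
    if not shared:
        raise ValueError("pockets have no atoms in common")
    squared = sum(dist(a[key], b[key]) ** 2 for key in shared)
    return sqrt(squared / len(shared))


def median_pairwise_rmsd(ensemble: list[Coords]) -> float:
    """Median of the RMSD values over all unordered pairs of structures."""
    if len(ensemble) < 2:
        raise ValueError("need at least two structures")
    return median(pocket_rmsd(a, b) for a, b in combinations(ensemble, 2))
```

Restricting the comparison to shared atoms is a design choice made here for simplicity; structures with missing residues or alternative ligand contacts would otherwise not be comparable.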
Pocketome: a term first introduced in Ref. 2 to signify the entire set of macromolecular binding sites for small molecules, drugs, substrates, and metabolites across the structural proteome. This encyclopedia focuses on binding sites for which multiple three-dimensional structures are available with different (or no) co-crystallized ligands; this subset can be called the experimental (or validated) pocketome [Ref. 3]. The theoretical pocketome (e.g. Ref. 4) is currently not included.
Site: the set of amino-acid residues that have been experimentally shown to participate in ligand binding in at least one of the co-crystal complexes. The site is a superset of all pockets projected onto the amino-acid sequence of the protein.
Pocket: the set of atoms that are in direct contact with the co-crystallized ligand in a single experimentally determined complex structure.
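The relationship between a pocket (atoms, one structure) and a site (residues, all structures) can be illustrated with a short sketch. The data layout, the structure labels, and the function name below are hypothetical; only the union-and-project logic follows from the definitions above.

```python
# Hypothetical representation: an atom is (residue number, atom name),
# with residue numbers already following the reference sequence.
Atom = tuple[int, str]


def site_from_pockets(pockets: dict[str, set[Atom]]) -> list[int]:
    """Project every pocket onto the sequence and take the union of residues."""
    residues: set[int] = set()
    for atoms in pockets.values():
        residues.update(resnum for resnum, _name in atoms)
    return sorted(residues)


# Made-up pockets from two structures of the same binding site.
pockets = {
    "structure_a": {(45, "CA"), (45, "CB"), (102, "OG")},
    "structure_b": {(102, "OG"), (130, "NE2")},
}
site = site_from_pockets(pockets)
```

A residue enters the site as soon as any one of its atoms contacts a ligand in any one structure, which is why the site is never smaller than an individual pocket's residue set.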
This work is partially supported by NIH grants R01 GM071872, U01 GM094612, U54 GM094618, and RC2 LM010994.