[POCKETOME]
 

Access and format

Online search

There are several online search options:

  • Pocketome;
  • Protein names;
  • Protein families;
  • Protein domains/regions;
  • PDB codes;
  • Ligand PDB Het IDs;
  • Compound names;
  • SwissProt knowledgebase.

A search keyword can be an extended alphanumeric pattern of up to 100 characters ([a-zA-Z0-9_,. -]{1,100}). Search keywords are additionally filtered according to the search type (e.g. a PDB search requires exactly 4 characters). If the keyword is empty after filtering, the server returns an 'Illegal search term' message. If the keyword contains capital letters, the search is case-sensitive (for protein names, families, and domains/regions only); otherwise it is case-insensitive.
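The keyword rules above can be checked on the client side before a query is sent. The following is a minimal sketch, not the server's own filtering code: the function names are hypothetical, and the search type values (proteinname, proteinfamily, proteindomain) are the ones listed for the searchType option of the GET request.

```python
import re

# Pattern quoted in the text: 1 to 100 characters from the allowed set.
KEYWORD_RE = re.compile(r"[a-zA-Z0-9_,. -]{1,100}")

# Search types that become case-sensitive when the keyword has a capital letter.
CASE_AWARE_TYPES = {"proteinname", "proteinfamily", "proteindomain"}


def is_valid_keyword(keyword: str) -> bool:
    """Return True if the whole keyword matches the documented pattern."""
    return KEYWORD_RE.fullmatch(keyword) is not None


def is_case_sensitive(keyword: str, search_type: str) -> bool:
    """Capitals trigger a case-sensitive search only for the three protein types."""
    return search_type in CASE_AWARE_TYPES and any(c.isupper() for c in keyword)
```

This only pre-screens the raw keyword; the additional per-type filtering done by the server (such as the 4-character rule for PDB codes) is not reproduced here.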

Pocketome search is a general search across all available indices. Its results may contain irrelevant hits; to reduce their number, use a longer search keyword (more than 4 characters).

Protein name, family, and domain/region searches cover both the primary and the second protein chain (if applicable). These search types are case-sensitive if the search keyword contains at least one capital letter.

Compound name search is a two-step search. First, the search keyword retrieves the relevant PDB Het IDs (at most 20), which are displayed in a mixed text-graphical mode that lets the user refine the query by choosing a specific Het ID. Second, a search by all retrieved Het IDs is performed automatically, and its hits are displayed immediately after the Het list.

SwissProt knowledgebase is a Pocketome search powered by the SwissProt search engine, which returns hits ordered by relevance. The search is performed in two steps: first, relevant protein names are retrieved from the SwissProt knowledgebase; then these protein names are searched within the Pocketome.

Online search returns a list of Pocketome entries that match the search query. Each entry comes with several information blocks: a brief description of the protein, domain annotation, a list of PDB and HET IDs, an entry pictogram, and two links to the individual entry pages with and without 3D visualization ([text] links). Long lists of PDB and HET IDs are folded to one line to reduce the vertical size of the entry list. To show the full list of IDs for one entry, hover the cursor over the list; to display the full lists for all found entries, click the [expand] link (this also makes them searchable with the browser's Ctrl+F).

The entry pictogram illustrates several properties of the binding site:

  • pictogram area is proportional to the binding site volume;
  • pictogram aspect ratio is equal to the ratio of the first two principal axes of the binding site (width is always the first axis);
  • pictogram frame represents the buriedness of the binding site: the ratio of black (protein) to cyan (solvent) corresponds to the median buriedness;
  • rectangular areas inside the pictogram frame represent relative contributions of different classes of protein groups: hydrophobic (yellow), negatively charged (red), positively charged (blue), noncharged polar (green) side chains and backbone (gray).
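The first two pictogram rules fix the rectangle up to a scale factor: if the area is proportional to the volume and the width-to-height ratio equals the ratio of the first two principal axes, then width and height follow directly. The sketch below works this out; the function name and the scale parameter are hypothetical, since the text does not state the proportionality constant used by the site.

```python
import math


def pictogram_size(volume: float, axis_ratio: float, scale: float = 1.0):
    """Return (width, height) such that
    width * height == scale * volume  and  width / height == axis_ratio.

    scale is an assumed proportionality constant (area per unit volume).
    """
    if volume <= 0 or axis_ratio <= 0 or scale <= 0:
        raise ValueError("volume, axis_ratio and scale must be positive")
    area = scale * volume
    width = math.sqrt(area * axis_ratio)   # first principal axis
    height = math.sqrt(area / axis_ratio)  # second principal axis
    return width, height
```

For example, a volume of 400 with an axis ratio of 4 and unit scale gives a 40 by 10 rectangle.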

GET request

http://pocketome.org/index.cgi?act=Go&searchTerm=GPCR&searchType=Pocketome&format=plain
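A request URL like the one above can be assembled programmatically. This is a minimal sketch using only the Python standard library; the helper name build_search_url is hypothetical, while the base URL and the parameter names (act, searchTerm, searchType, format) are taken from the example request.

```python
from urllib.parse import urlencode

BASE_URL = "http://pocketome.org/index.cgi"


def build_search_url(term: str, search_type: str = "Pocketome", plain: bool = True) -> str:
    """Build a search URL; urlencode escapes spaces and other special characters."""
    params = {"act": "Go", "searchTerm": term, "searchType": search_type}
    if plain:
        params["format"] = "plain"  # request plain-text output instead of HTML
    return BASE_URL + "?" + urlencode(params)
```

Calling build_search_url("GPCR") reproduces the example URL; the result can then be passed to any HTTP client.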

format

format=plain means that the response is in BSV format. The first line is a header with the following column names:

id
the main identifier used for filenames and links;
protNameA
protein name of the master chain;
protFamilyA
protein family of the master chain;
protDomainA
protein domains/regions of the master chain;
PDBfull
CSV list of all PDB ids associated with the entry;
HETs
CSV list of HETs;
protSwissB
SwissProt mnemonic name of the second protein chain, if any;
protNameB
protein name of the second chain;
protFamilyB
protein family of the second chain;
protDomainB
protein domains/regions of the second chain;
orgClass
CSV list of organism classes, e.g. mammal;
volume
shape
buriedness
negative
positive
polar
backbone
hydrophobic
see above for pictogram description.

If the search term is not found, the response contains no data rows (only the header is returned). If the search pattern is empty after filtering, the server returns an 'Illegal search term' message instead of the header.

If the format option is not provided, or its value is not plain, the server returns HTML output as usual.

The format=plain option supports only two actions: Go and browseall. If an unsupported action is combined with format=plain, the server response is empty.
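The three outcomes described above (data rows, header only, 'Illegal search term') can be handled in a small parser. This is a sketch under a stated assumption: the text does not say which character separates BSV fields, so the delimiter is a parameter and its default "|" is a guess that should be checked against a real response. The function name and the sample values in the usage note are hypothetical.

```python
def parse_plain_response(text: str, delimiter: str = "|"):
    """Parse a format=plain response into a list of dicts keyed by column name.

    delimiter is an assumption; verify it against actual server output.
    Returns [] for an empty response or a header-only response.
    Raises ValueError if the server reported an illegal search term.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []  # empty response, e.g. unsupported action with format=plain
    if "Illegal search term" in lines[0]:
        raise ValueError("Illegal search term")
    header = lines[0].split(delimiter)
    return [dict(zip(header, line.split(delimiter))) for line in lines[1:]]
```

Fields documented as CSV lists, such as PDBfull and HETs, can then be split further on commas, e.g. row["PDBfull"].split(",").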

act

There are several actions:

Go
search;
abstract
home page;
browseall;
documentation
this page;
citation;
contact;
disclaimer.

If no act option is provided, or its value is not recognized, the server will return the home page.

searchTerm

searchTerm can be an extended alphanumeric pattern of up to 100 characters ([a-zA-Z0-9_,. -]{1,100}). Search patterns are additionally filtered according to the search type.

If the search pattern is empty after filtering, the server returns an 'Illegal search term' message in both HTML and plain-text formats.

If the searchTerm value contains capital letters, the search is case-sensitive (only for the proteinname, proteinfamily, and proteindomain values of the searchType option); otherwise it is case-insensitive.

If the search pattern is not found, the HTML output contains a 'Search term not found' message. If format=plain is provided, the BSV response contains only the header.

searchType

There are several search types available:

  • Pocketome;
  • proteinname;
  • proteinfamily;
  • proteindomain;
  • pdbid;
  • hetid;
  • chemname;
  • swissprot.

If the value of the searchType option is not recognized, the server returns the home page.

Online Pocketome entry

Online entry access includes retrieval of related entries from linked databases and interactive views of individual pocket structures and ensembles, which are generated dynamically in response to mouse clicks in the HTML part of the entry page:

  • header clicks;
  • contact maps clicks;
  • pairwise comparison matrices clicks;
  • entry download.

The header contains a detailed description of the binding site and links to sources of bioannotation. If a recent version of the ActiveICM browser plugin is detected, the header also provides a set of ICM controls to customize the interactive 3D visualization. Hover the cursor over a control to see a tooltip with a brief description.

Contact maps have one (site page) or two (pocket page) clickable and sortable columns: PDB.ch and ligand. Clicking a link in the ligand column shows the individual binding pocket. Clicking a link in the PDB.ch column displays the binding pocket with the full-length protein chains (pocket page) or highlights it within the ensemble (site page).

The pairwise comparison matrices page appears only for ensembles of two or more structures and contains controls to customize the view of the matrices. Zooming is designed to provide the best possible fit to the page width (as opposed to a simple change of font size). For very large ensembles the table may be too wide to fit on a single screen; to capture the whole picture, the matrix can be displayed as an image.

Single entry download

A Pocketome entry can be downloaded in a number of formats using links provided in the individual entry pages.

id.icb

The whole entry in ICM binary format. This format provides the most powerful access, using ICM or the free ICM Browser software. The icb file for a Pocketome entry contains a tagged ensemble of 3D structures and a full set of HTML pages. Each file includes an interactive help page.

id.xml

The text representation of the entry in XML format. Contains protein annotation, contact maps and comparison matrices, but doesn't include 3D coordinates. See below for the Pocketome entry XML schema.

id.png

A static 3D visualization.

id_{pocket,site}.tsv

Two contact maps in TSV format.

id_{compatibility,clashes,rmsdBB,rmsdTot}.{txt,png}

Four comparison matrices in plain text and PNG formats.

TSV contact maps

Each Pocketome entry has two contact maps in TSV format (id_pocket.tsv and id_site.tsv) available for download. These two maps provide the same information as the online pocket and site pages.

The TSV files are structured as follows: a comment section with the site annotation, a header line, and a data section (one row per object). A file can be read into ICM:

read table separator="\t" "id_site.tsv" header comment

Comment section contains site annotation:

# id>>id
main ID of the site, used for file names;
# A1 10 1,3:11 D:_domain_name_(3:11);_R_region_name_(5,6) PROT_ORG P12345
description of each protein chain: chain identifier, number of site residues, list of site residues, domains/regions annotation, SwissProt mnemonic name (for A1/B1 chains only), SwissProt accession number (for A1/B1 chains only);
# cF cofactor1/cofactor2
cofactors list (if any);
# Me Me1/Me2
metal ions list (if any);
# pdblstp>>pdb1,pdb2
list of processed PDBs;
# pdblstu>>pdb1,pdb2
list of unprocessed PDBs (if any);
# pdblstr>>pdb1,pdb2
list of redundant PDBs (if any).
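Outside ICM, the same file layout (comment lines starting with #, then a header line, then tab-separated data rows) can be read with the Python standard library. The sketch below is an illustration, not an official reader: the function name is hypothetical, and only comment lines of the key>>value form (id, pdblstp, pdblstu, pdblstr) are turned into dictionary entries; the chain, cofactor, and metal lines are returned unparsed.

```python
import csv
import io


def read_contact_map(text: str):
    """Split a contact-map TSV into (annotation, other_comments, header, rows)."""
    annotation, other_comments, table_lines = {}, [], []
    for line in text.splitlines():
        if line.startswith("#"):
            comment = line[1:].strip()
            if ">>" in comment:
                key, value = comment.split(">>", 1)
                annotation[key.strip()] = value.strip()
            else:
                other_comments.append(comment)  # chain, cF and Me lines
        elif line.strip():
            table_lines.append(line)
    reader = csv.reader(io.StringIO("\n".join(table_lines)),
                        delimiter="\t", quoting=csv.QUOTE_NONE)
    table = list(reader)
    header = table[0] if table else []
    return annotation, other_comments, header, table[1:]
```

Rows are kept as plain lists rather than dictionaries because the per-residue columns may not have unique names. The PDB lists can be split on commas, e.g. annotation["pdblstp"].split(",").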

The header string and data scheme:

pdb.ch
PDB and master chain;
XRes
resolution of the X-ray structure in Å;
lig
ligand; usually it is a single HET code (lower case) or _ in case of an apo structure; if the ligand molecule consists of standard amino acids or nucleotides, it is abbreviated with standard 1-letter code (upper case); if the ligand molecule consists of several residues, they are delimited with a dot (.) when necessary (between two consecutive HET codes); if the ligand molecule consists of several residues and not all residues of the ligand molecule are considered as a ligand, only ligand residues of the ligand molecule are listed (common case is a protein ligand); non-consecutive ligand residues within the ligand molecule are delimited with a dash (-); if the ligand consists of several molecules, they are delimited by a comma (,);
lig_Nat
total number of atoms of the ligand (0 for apo structures);
residue_cont
for every residue of each chain: 1-letter code for contact with the ligand (id_pocket.tsv) or the ligands (id_site.tsv); contact code is one of the following: B backbone contact, S side chain contact, F both backbone and side chain contact, C covalent bond, . no contact, X clash (id_site.tsv only);
residue_mut
for every residue of each chain: mutation (1-letter abbreviation for standard residues or HET code) or - for deletion or blank;
residue_info
for every residue of each chain: additional information (covlig tag and/or clashes annotation) or blank; id_site.tsv only: [covlig][[;]clashes]; clashes has the following format: lig.clash(number_of_clashes):pdb.ch:ligand[|pdb.ch:ligand];
cF
cofactors (if any);
cF_cont
1-letter code for contact of cofactors (if any) with the ligand (id_pocket.tsv) or the ligands (id_site.tsv); contact code is one of the following: M contact, m no contact;
Me
metal ions (if any);
Me_cont
1-letter code for contact of metal ions (if any) with the ligand (id_pocket.tsv) or the ligands (id_site.tsv); contact code is one of the following: M contact, m no contact.
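The contact codes and the top-level structure of the lig field can be decoded with a few lines of Python. This is a minimal sketch with hypothetical helper names; it handles only the comma-separated molecule level and the _ marker for apo structures, and leaves the dot and dash residue delimiters within a molecule untouched.

```python
# Residue contact codes as listed in the scheme above.
CONTACT_CODES = {
    "B": "backbone contact",
    "S": "side chain contact",
    "F": "backbone and side chain contact",
    "C": "covalent bond",
    ".": "no contact",
    "X": "clash (id_site.tsv only)",
}

# Codes that denote an actual contact with the ligand.
CONTACTING = {"B", "S", "F", "C"}


def split_ligand(lig: str):
    """Return the ligand molecules of a lig field; an apo structure (_) gives []."""
    if lig == "_":
        return []
    return lig.split(",")


def contacting_positions(codes):
    """Return the indices of residues whose code is B, S, F or C."""
    return [i for i, code in enumerate(codes) if code in CONTACTING]
```

Here codes is expected to be the sequence of per-residue contact cells taken from one data row.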

Comparison matrices

The third tab, pairwise comparison, contains four comparison matrices. These matrices are available in three formats: HTML tables (default), PNG images, and downloadable text files.

Structure of a comparison matrix text file:

id>>id
main ID of the site, used for file names;
number_of_objects
pdb.ch cluster_mask ligand
for each object (table head): PDB ID and master chain, cluster_mask is 1 if the next object belongs to the next cluster and 0 otherwise, ligand is blank in case of an apo structure;
>> cluster_number pdb.ch
for each object (table leftmost column);
r_value hex_color
for each cell; id_compatibility.txt only: negative r_value means an apo structure.

All entries

A single compressed (.tbz) file containing all Pocketome entries in XML format is available for download:

pocketome-xml.tbz

Release 16.12 (2016-12-05)
size: 6.5 MB
md5: 422b377adcaad07596c53122aaf9da08
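After downloading, the archive can be checked against the published MD5 checksum. The sketch below uses Python's hashlib; the function names are hypothetical, and the expected value is the one given above for release 16.12, so it needs updating for other releases.

```python
import hashlib

# Checksum published above for pocketome-xml.tbz, release 16.12.
EXPECTED_MD5 = "422b377adcaad07596c53122aaf9da08"


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path: str, expected: str = EXPECTED_MD5) -> bool:
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()
```

A verified .tbz file is a bzip2-compressed tar archive, so it can be opened with tarfile.open(path, "r:bz2").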

XML schema

The XML schema describes the structure of a Pocketome entry XML file. Every Pocketome entry XML file was validated against the schema and against additional constraints on the value space.

pkentry-105.xsd

Pocketome entry XML schema v. 105 (2014-07-08)
md5: f6102912dcbe06745d1ce76ed2a24685