DISCO - Directory of Structures for Crossdocking

To generate a benchmark for your desired targets, use our free online system, following the instructions below...

Submit the email address to which the results should be sent. The address is used only to validate your request and to deliver your results. None of your information will be retained.
Submit the 8-character validation code that will be emailed to you.
Provide the input for the targets you wish to generate crossdocking sets for. Your input should be formatted as follows...

Your input must be a list of targets separated by new lines.
The input for each target should contain two or three fields separated by whitespace (the first field is optional)...
- Target Name - How the target will be referenced in the directory structure (Optional)
- PDB ID of Target Reference - The 4-character Protein Data Bank ID for your receptor
- Ligand ID - The 3-character ID for the ligand of interest in your receptor
Sample input (3 field mode)
- AA2AR 3EML ZMA
Sample input (2 field mode; the Target Name defaults to the PDB ID)
- 3EML ZMA
Note that the submission form checks only the syntax of your input. Any Target whose PDB ID or Ligand ID does not exist will be dropped during processing.
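To check a target list before submitting it, the format above can be sketched in a few lines of Python. This is only an illustration of the syntax rules as described on this page, not the code used by the submission form. The function name parse_targets and the choice to accept only letters and digits in each ID are assumptions.

```python
import re

# Assumed syntax rules, taken from the field descriptions above:
# a 4-character PDB ID and a 3-character ligand ID, letters and digits only.
PDB_ID = re.compile(r"[A-Za-z0-9]{4}")
LIGAND_ID = re.compile(r"[A-Za-z0-9]{3}")


def parse_targets(text):
    """Parse a newline-separated target list.

    Returns (targets, errors), where targets is a list of
    (name, pdb_id, ligand_id) tuples and errors is a list of
    (line_number, message) tuples.
    """
    targets, errors = [], []
    for line_no, line in enumerate(text.splitlines(), start=1):
        fields = line.split()  # split on any run of whitespace
        if not fields:
            continue  # ignore blank lines
        if len(fields) == 3:
            name, pdb_id, ligand_id = fields
        elif len(fields) == 2:
            pdb_id, ligand_id = fields
            name = pdb_id  # 2 field mode: the name defaults to the PDB ID
        else:
            errors.append((line_no, "expected 2 or 3 fields"))
            continue
        if not PDB_ID.fullmatch(pdb_id):
            errors.append((line_no, "PDB ID must be 4 characters"))
            continue
        if not LIGAND_ID.fullmatch(ligand_id):
            errors.append((line_no, "ligand ID must be 3 characters"))
            continue
        targets.append((name, pdb_id, ligand_id))
    return targets, errors
```

As with the submission form, this checks syntax only. It cannot tell whether a PDB ID or ligand ID actually exists.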

After you submit your input, your results will be emailed to you as soon as they are ready. This may take anywhere from several minutes to several hours or more, depending on the size of the request and on server traffic.
Your results will remain available on the site for 24 hours after the email is sent.

Getting Started

If you are having trouble getting started or selecting the proper input, consider visiting the Protein Data Bank to identify potential protein structures from which to construct a dataset. Make sure that each protein for which you wish to generate homologous receptor-ligand pairs contains a valid ligand. The PDB information page for each structure contains a list of ligands towards the bottom. Your input must include the 4-character PDB ID of the structure as well as the 3-character ligand ID for your ligand of interest.

How the CrossDocking Generator Works

The CrossDocking Generator compiles a crossdocking benchmark for the user's Targets of interest. The generator returns a dataset consisting of all known structures that are homologous to the given Target and contain an inhibitory ligand. Each structure is split into separate ligand and receptor .pdb files. For convenience, all receptor structures are stripped of any co-factors and all structures are aligned to the reference PDB ID for the Target. The resulting dataset can be used as a benchmark in crossdocking studies either by docking the various ligands from each receptor-ligand set to one reference receptor, or by docking the ligands across various receptors. For each Target, the following steps are carried out...

The user specified PDB ID is downloaded and the specified ligand is extracted.

This results in two files...
- PDBID_PRO.pdb - The receptor structure stripped of all co-factors and ligands
- PDBID_LIG.pdb - The specified ligand of interest
If the PDB ID is invalid or does not contain the specified ligand, the Target is rejected and a note is made in the ERROR_LOG.txt file
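The extraction step above can be illustrated with a minimal Python sketch that works directly on the fixed-column records of a PDB file. It is not the generator's own code. The function names are hypothetical, and the sketch makes two simplifying assumptions: every ATOM record belongs to the receptor, and every HETATM record whose residue name differs from the ligand ID (waters, ions, co-factors, modified residues) is discarded.

```python
def split_structure(pdb_lines, ligand_id):
    """Split PDB records into (receptor_lines, ligand_lines).

    Relies on the PDB fixed-column layout: the record name is in
    columns 1-6 and the residue name in columns 18-20.
    """
    receptor, ligand = [], []
    for line in pdb_lines:
        record = line[:6].strip()
        res_name = line[17:20].strip()
        if record == "ATOM":
            receptor.append(line)  # assumed to be a receptor atom
        elif record == "HETATM" and res_name == ligand_id:
            ligand.append(line)  # every copy of the ligand is kept
        # All other records, including other HETATM groups, are dropped.
    if not ligand:
        # Mirrors the rejection case described above.
        raise ValueError(f"ligand {ligand_id} not found in structure")
    return receptor, ligand


def output_names(pdb_id):
    """Return the receptor and ligand file names used in the results."""
    return f"{pdb_id}_PRO.pdb", f"{pdb_id}_LIG.pdb"
```

For example, calling split_structure on the lines of a downloaded 3EML file with ligand ID ZMA and writing the two lists to the names returned by output_names("3EML") would give a 3EML_PRO.pdb and a 3EML_LIG.pdb in the spirit of the files listed above.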

All known structures with 80% homology to the reference receptor are considered. Those structures containing a known inhibitory ligand are retained.

Structures that do not contain a known inhibitory ligand are rejected
- Each rejected structure and the reason for its rejection are noted in the PDBS_considered.txt file

These structures are separated into receptor and ligand files.
Structures are aligned to the reference receptor to allow easy comparison between known and docked ligand positions.

Structures may be rejected at this point for the following reasons...
- The structure does not properly align to the receptor
- The structure does not contain a ligand in the same binding pocket as the reference receptor
- The structure contains multiple ligands or co-factors within the vicinity of the selected ligand (this would interfere with docking simulations)
The ligand that most closely fits the binding pocket of the reference ligand is retained
All other ligands and co-factors are stripped from the receptor file along with any extraneous polymer chains
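How the generator decides which ligand "most closely fits" the reference binding pocket is not specified on this page. As one plausible reading, the sketch below picks the candidate whose centroid lies nearest the centroid of the reference ligand once the structures have been aligned. The function names, the centroid criterion, and the 5.0 angstrom cutoff are all assumptions made for illustration.

```python
import math


def centroid(coords):
    """Return the mean (x, y, z) position of a list of atom coordinates."""
    n = len(coords)
    return tuple(sum(atom[i] for atom in coords) / n for i in range(3))


def closest_ligand(reference_coords, candidates, max_distance=5.0):
    """Pick the candidate ligand nearest the reference ligand.

    reference_coords: list of (x, y, z) tuples for the reference ligand.
    candidates: dict mapping a ligand label to its list of (x, y, z) tuples,
        already aligned to the reference receptor.
    max_distance: hypothetical cutoff in angstroms; candidates farther away
        are treated as lying outside the reference binding pocket.

    Returns the label of the best candidate, or None if none is close enough.
    """
    ref_center = centroid(reference_coords)
    best_label, best_dist = None, max_distance
    for label, coords in candidates.items():
        dist = math.dist(ref_center, centroid(coords))
        if dist <= best_dist:
            best_label, best_dist = label, dist
    return best_label
```

A return value of None corresponds to the rejection case in which the structure has no ligand in the same binding pocket as the reference receptor.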

The final directory for each Target consists of...

PDBS_considered.txt - A list of all those PDB IDs that were considered but ultimately rejected, along with the reason for rejection
PDBS_kept.txt - A list of all those PDB IDs that were retained in the dataset
lig_map2.txt - A map from PDB ID to ligand ID (during processing, each ligand is renamed to LIG for convenience)
PDB_Structures - A directory containing all of the separated receptor and ligand structures

Troubleshooting

If errors occur in processing any Target, they will be noted in the ERROR_LOG.txt file in your results. The problem likely stems from one of the following...

The PDB ID is invalid.
The PDB structure does not contain a ligand with the ligand ID specified.

Please note that the system automatically rejects common co-factors based on a blacklist of known co-factors and on molecular weight (a valid ligand must have a molecular weight > 150 Da).
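The co-factor filter just described can be sketched as a simple two-part check. The blacklist below is a hypothetical sample of common PDB component IDs chosen for illustration. The generator's actual blacklist is not published on this page, and the function name is likewise an assumption.

```python
# Hypothetical blacklist for illustration only: water, sulfate, phosphate,
# glycerol, ethylene glycol, NAD, FAD, heme, and ATP.
COFACTOR_BLACKLIST = {"HOH", "SO4", "PO4", "GOL", "EDO", "NAD", "FAD", "HEM", "ATP"}

# Threshold in daltons, taken from the rule stated above.
MIN_MOLECULAR_WEIGHT = 150.0


def is_valid_ligand(ligand_id, molecular_weight, blacklist=COFACTOR_BLACKLIST):
    """Return True if the ligand passes both the blacklist and weight checks."""
    if ligand_id.upper() in blacklist:
        return False  # known co-factor, rejected regardless of weight
    return molecular_weight > MIN_MOLECULAR_WEIGHT  # strictly greater than 150
```

If your ligand of interest is small or is a common co-factor, a check of this kind is the likely reason it was reported as missing from the structure.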

The generator failed to identify any homologous structures containing a valid ligand. Either...

The PDB ID does not have any known homologous structures
None of the homologous structures contain a valid ligand
None of the homologous structures could be aligned properly
- Please note that the system rejects structures with a co-factor within the same binding pocket as the ligand. As a result some structures are rejected (perhaps unnecessarily). For example, structures containing an ionic co-factor conjugated to the receptor at the binding site will be rejected. Interactions such as these may not be accounted for by docking software, so potentially troublesome cases such as these have been eliminated from this benchmark.

Cross Docking Benchmark is provided by the Camacho Laboratory at the University of Pittsburgh and is developed by Shayne Wierbowski of the University of Scranton and Jim Zheng of the University of Pittsburgh.