BindUP-Alpha - Predicting DNA- and RNA-Binding Proteins based on Experimental and Computational Structural Models

BindUP-Alpha Manual

Single protein mode

Input
Calculation options

Chain selection
Patches calculation

General options
Results

Patches visualization
Files for download

Batch mode

Input
Calculation options
General options
Results

Single protein mode

In this mode, BindUP-Alpha analyzes a specific chain or all the protein chains of a single protein structure. The results include the nucleic-acid (NA) binding prediction for each analyzed chain and the requested electrostatic patches, visualized with ChimeraX.

Input

BindUP-Alpha requires a coordinate file of a protein structure, solved experimentally or predicted, in PDB or mmCIF format. The input can be given in one of two ways:

Supply a valid protein ID:

A valid PDB ID (a four-character alphanumeric identifier, taken from RCSB PDB, case insensitive). In this case, BindUP-Alpha retrieves the results from a database of pre-calculated predictions, so the results are presented very quickly.
A valid UniProt ID or a Computed Structure Model (CSM) ID, taken from RCSB PDB or AlphaFold (case insensitive). A UniProt ID consists of six or ten alphanumeric characters; a CSM ID wraps the UniProt ID in the form "AF-XXXXXX-F1-vX". In this case too, BindUP-Alpha retrieves the results from a database of pre-calculated predictions, so the results are presented very quickly.

Upload a user-defined coordinate file (of a known structure or a model) in PDB format (view example) or in mmCIF format (view example). In this case, the calculation is performed in real time. The calculation time can vary from a few seconds to several minutes, depending on the size of the protein structure.
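The identifier forms described above can be told apart with a few regular expressions. The following is a minimal sketch, not BindUP-Alpha's own validation: the function name `classify_id` and the exact patterns are assumptions derived only from the character counts and the "AF-XXXXXX-F1-vX" form given in this manual.

```python
import re

# Patterns are assumptions based on the lengths stated in the manual.
PDB_ID = re.compile(r"[0-9A-Z]{4}")
UNIPROT_ID = re.compile(r"[0-9A-Z]{6}|[0-9A-Z]{10}")
CSM_ID = re.compile(r"AF-(?:[0-9A-Z]{6}|[0-9A-Z]{10})-F1-V[0-9]+")


def classify_id(raw: str) -> str:
    """Return 'pdb', 'uniprot', 'csm' or 'unknown' for one identifier."""
    token = raw.strip().upper()  # IDs are case insensitive
    if PDB_ID.fullmatch(token):
        return "pdb"
    if UNIPROT_ID.fullmatch(token):
        return "uniprot"
    if CSM_ID.fullmatch(token):
        return "csm"
    return "unknown"
```

A check like this only confirms that an identifier has a plausible shape; whether the entry actually exists in the pre-calculated database can only be determined by the server.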

Please note: the following types of protein structures cannot be analyzed by BindUP-Alpha:

Protein chains containing fewer than 20 residues.
Protein chains containing unknown (UNK) residues.
Protein chains in which there are missing backbone atoms.
Protein chains in which more than 10% of the total heavy atoms are missing.
Proteins containing multi-character chain IDs.
Proteins containing more than 62 chains.
Proteins containing more than 99999 ATOM coordinates.
Proteins that have a complex beta sheet topology (see more details).
Proteins containing B-factors larger than 999.99.
Proteins that have chemical IDs (for ligands and chemical components) that are 5 characters long.
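Some of these restrictions can be checked locally before uploading a file. The sketch below is a partial pre-check for PDB-format text only, under stated assumptions: it reads the standard fixed columns of ATOM records, considers only the first model, and covers just the residue-count, UNK, chain-count and ATOM-count limits. The function name `precheck_pdb` and the report keys are hypothetical; missing atoms, beta sheet topology, B-factors and chemical IDs are not examined.

```python
from collections import defaultdict

MIN_RESIDUES = 20   # chains with fewer residues cannot be analyzed
MAX_CHAINS = 62
MAX_ATOMS = 99999


def precheck_pdb(lines):
    """Partial pre-check of PDB-format lines against four of the limits."""
    residues = defaultdict(set)   # chain ID -> set of (resSeq, iCode)
    unk_chains = set()
    atom_count = 0
    for line in lines:
        if line.startswith("ENDMDL"):
            break                 # assumption: only the first model is relevant
        if not line.startswith("ATOM") or len(line) < 27:
            continue
        atom_count += 1
        chain = line[21]                               # column 22
        residues[chain].add((line[22:26], line[26]))   # columns 23-27
        if line[17:20].strip() == "UNK":               # columns 18-20
            unk_chains.add(chain)
    return {
        "too_short": sorted(c for c, r in residues.items()
                            if len(r) < MIN_RESIDUES),
        "has_unk": sorted(unk_chains),
        "too_many_chains": len(residues) > MAX_CHAINS,
        "too_many_atoms": atom_count > MAX_ATOMS,
    }
```

Typical use would be `precheck_pdb(open("model.pdb"))`, where `model.pdb` is a placeholder file name. A clean report does not guarantee that the server accepts the structure, since the remaining restrictions are not tested here.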

Calculation options

Chain selection

By default, the calculation is performed on all the protein chains (each protein chain is calculated separately). However, it is possible to select a specific chain identifier. In that case, only the selected chain is calculated and displayed in the results.

Patches calculation

By default, BindUP-Alpha displays the largest positive electrostatic patch. However, it is possible to choose whether to display only positive patches, only negative patches or a combination of both.
Number of patches to display: When calculating only one type of patch, BindUP-Alpha can display up to 3 patches together. When calculating both positive and negative patches, BindUP-Alpha can display only the largest patch of each type.
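The rule for the number of patches can be written down compactly. This is an illustrative sketch only: the option names "positive", "negative" and "both" and the function `check_patch_request` are hypothetical and do not correspond to any documented BindUP-Alpha interface.

```python
# Maximum number of patches per type, as stated in the manual.
MAX_PATCHES = {"positive": 3, "negative": 3, "both": 1}


def check_patch_request(patch_type: str = "positive", count: int = 1) -> int:
    """Return count if the request is allowed, otherwise raise ValueError."""
    if patch_type not in MAX_PATCHES:
        raise ValueError(f"unknown patch type: {patch_type!r}")
    limit = MAX_PATCHES[patch_type]
    if not 1 <= count <= limit:
        raise ValueError(f"{patch_type!r} allows 1 to {limit} patches, got {count}")
    return count
```

The defaults mirror the default behavior described above, namely the single largest positive patch.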

General options

Email address: The email address is an optional field, required only if you want to receive a link to the results page. If you do not receive an email from BindUP-Alpha within a reasonable time, check your spam folder, as the message may have been filtered there.

Results

In single protein mode, the results for each calculated protein chain include four components (view an example of the results page):

A prediction for nucleic-acid binding. Please note that predictions are calculated for each chain separately.
A visualization of the requested electrostatic patches using ChimeraX.
A summary text file, listing the residues composing each calculated patch and the NA-binding prediction as well as its confidence score.
A coordinate mmCIF file, including the patches annotation inserted in the B-factor column.

When the calculation is performed on several protein chains, all the valid chains are initially displayed. The displayed results can be switched to any individual protein chain by using the drop-down menu.

Patches visualization

BindUP-Alpha visualizes the requested patches on the protein chain surface using ChimeraX (a next-generation molecular visualization program).

The positive patches are colored in shades of blue (dark to light blue for the first to the third largest positive patch, respectively).
The negative patches are colored in shades of red (dark to light red for the first to the third largest negative patch, respectively).
When displaying the largest positive and negative patches together, the common residues are colored in purple.

Files for download

Summary text file: This file includes the NA-binding prediction for the chain and its confidence score and lists the residues composing each calculated patch (view example).
mmCIF file: A coordinate file in mmCIF format, containing only the ATOM lines of the current chain and the calculated patches annotation (view example). This file gives advanced users the flexibility to color the patches and present them in any molecular visualization program, such as ChimeraX or PyMOL. The patches annotation is written to the B-factor (temperature) column, according to the following code:
- 10: atoms that belong to the first positive patch (BindUP-Alpha color code: [0,0,198]).
- 20: atoms that belong to the second positive patch (BindUP-Alpha color code: [19,131,253]).
- 30: atoms that belong to the third positive patch (BindUP-Alpha color code: [156,203,254]).
- -10: atoms that belong to the first negative patch (BindUP-Alpha color code: [200,0,0]).
- -20: atoms that belong to the second negative patch (BindUP-Alpha color code: [254,120,105]).
- -30: atoms that belong to the third negative patch (BindUP-Alpha color code: [254,192,182]).
- 5: atoms that belong to both the first positive patch and the first negative patch (BindUP-Alpha color code: [182,40,217]).
- 0: atoms that do not belong to any of the requested patches (BindUP-Alpha color code: [232,232,232]).
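For users who want to recolor the patches in their own scripts, the code table above can be turned into a small lookup. The sketch below is one possible way to do this: the codes and RGB values are taken from the list above, while the names `PATCH_CODES` and `describe_b_factor` and the conversion to a hex color string are assumptions made for illustration.

```python
# B-factor code -> (patch description, RGB color), copied from the table above.
PATCH_CODES = {
    10: ("first positive patch", (0, 0, 198)),
    20: ("second positive patch", (19, 131, 253)),
    30: ("third positive patch", (156, 203, 254)),
    -10: ("first negative patch", (200, 0, 0)),
    -20: ("second negative patch", (254, 120, 105)),
    -30: ("third negative patch", (254, 192, 182)),
    5: ("first positive and first negative patch", (182, 40, 217)),
    0: ("no requested patch", (232, 232, 232)),
}


def describe_b_factor(value: float):
    """Map a B-factor value from the mmCIF file to (description, hex color)."""
    code = int(round(value))      # B-factors are stored as decimals, e.g. 10.00
    if code not in PATCH_CODES:
        raise ValueError(f"unexpected patch code: {value}")
    label, rgb = PATCH_CODES[code]
    return label, "#{:02x}{:02x}{:02x}".format(*rgb)
```

For example, `describe_b_factor(10.0)` returns `("first positive patch", "#0000c6")`. The hex string can then be passed to whichever coloring command the chosen visualization program provides.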

Presentation of all the protein chains together

When the option 'all' is selected in the drop-down menu, BindUP-Alpha presents the results for all the calculated protein chains together (view an example of this presentation). Please note that this affects only the presentation; the calculation is always performed on each chain separately.

Batch mode

This mode runs more than one protein structure as a batch. The results, including the NA-binding predictions and the electrostatic patches, are given in text files only, without the 3D visualization.

Input

BindUP-Alpha requires a list of IDs (case insensitive), one per line. The number of entries is not limited. The entries can be written in three forms:

PDB ID (for example: 6kda). In this case, all the protein chains of the entry will be calculated.
A PDB ID + a specific chain ID (for example: 6kda). In this case, only the specified chain will be calculated.
A UniProt or a CSM ID (for example: Q9UBC3 or AF-Q9UBC3-F1-v6).

The list of ID entries may contain the three forms mixed together (view example).
It is possible to paste the list of entries into the browser or to upload a text (.txt) file containing the list.
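A list of entries gathered from several sources can be cleaned up before it is pasted or saved to a .txt file. The following is a minimal sketch under the assumption that blank lines and repeated IDs are not wanted; the function name `normalize_batch_list` is hypothetical, and BindUP-Alpha may well handle such input on its own.

```python
def normalize_batch_list(entries):
    """Return the entries as newline-separated text, one ID per line.

    Blank entries are dropped, and duplicates are removed case-insensitively
    because the IDs are case insensitive. The original order is kept.
    """
    seen = set()
    cleaned = []
    for raw in entries:
        entry = raw.strip()
        if not entry:
            continue
        key = entry.lower()
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(entry)
    return "\n".join(cleaned) + "\n"
```

The returned text can be pasted directly into the input box or written to a file with a .txt extension for upload.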

Calculation options

Patches calculation

By default, BindUP-Alpha calculates the largest positive electrostatic patch. However, it is possible to choose whether to calculate only positive patches, only negative patches or a combination of both.
Number of patches to display: When calculating only one type of patch, BindUP-Alpha can display up to 3 patches together. When calculating both positive and negative patches, BindUP-Alpha can display only the largest patch of each type.

General options

Job name: An optional parameter that enables you to give your job an informative name. If not filled, the job gets a unique number identifier.

Results

In batch mode, the results are presented for each protein structure and include only the text files (described above), without a 3D visualization (view an example of the results page). If the user has specified a chain identifier, both the summary text file and the mmCIF file will include the results for the specified chain only. Otherwise, the text files will include the results for all the calculated protein chains.
If there is more than one valid entry, an additional file, which summarizes the results for all the entries together, is also provided for download.