The DR_bind DNA-binding residue prediction webserver implements a structure-based DNA-binding residue prediction method based on (a) electrostatics, (b) conservation, and (c) geometry with the following rationale: (a) DNA-binding residues contain electropositive atoms, which would be in an unfavorable electrostatic environment in the absence of DNA or water; thus replacing one of these residues with a negatively charged Asp−/Glu− would alleviate the electrostatic repulsion among the electropositive atoms in the gas phase. (b) DNA-binding residues and residues in the vicinity, which form a cluster of spatially interacting residues, are usually highly conserved within the same family due to their critical functional roles. (c) DNA-binding residues have been observed to be located on surface patches, as opposed to clefts/cavities for RNA-binding residues and enzyme substrates.
An aa X is considered accessible for interacting with DNA if the percent ratio of its side-chain solvent-accessible surface area in the protein to that in the tripeptide, −Gly−X−Gly−, is >5%. MOLMOL was used to compute the relative solvent-accessible surface area of each aa from the protein structure using a solvent probe radius of 1.4 Å.
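As a minimal illustration of this accessibility criterion, the Python sketch below applies the 5% cutoff to precomputed side-chain surface areas. The function names and the example areas are hypothetical; the areas themselves would come from a tool such as MOLMOL, which is not called here. How residues without a side chain (Gly) are treated is not stated above, so the sketch simply rejects a zero reference area.

```python
def relative_accessibility(sasa_in_protein: float, sasa_in_tripeptide: float) -> float:
    """Percent ratio of side-chain SASA in the protein to that in Gly-X-Gly."""
    if sasa_in_tripeptide <= 0.0:
        # Assumption: a zero reference area (e.g. Gly) is treated as an error here.
        raise ValueError("reference tripeptide SASA must be positive")
    return 100.0 * sasa_in_protein / sasa_in_tripeptide


def is_accessible(sasa_in_protein: float, sasa_in_tripeptide: float,
                  cutoff_percent: float = 5.0) -> bool:
    """True if the relative accessibility strictly exceeds the cutoff (>5%)."""
    return relative_accessibility(sasa_in_protein, sasa_in_tripeptide) > cutoff_percent


# Hypothetical areas in square angstroms, for illustration only.
exposed = is_accessible(12.0, 200.0)   # 6% of the reference area
buried = is_accessible(10.0, 200.0)    # exactly 5%, so not above the cutoff
```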
Since DNA-binding sites are found on a protein surface, surface patches were generated by defining the Cα atom of each residue as an origin of a patch and including all residues whose atoms were within 10 Å of the origin in the patch. Non-identical patches with >5 solvent-accessible residues were used in computing the average electrostatic energy change and conservation (see below).
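The patch construction can be sketched as follows. This is an illustrative reading, not the webserver's code: the data layout and the name `build_patches` are hypothetical, "within 10 Å" is assumed to mean that at least one atom of a residue lies at a distance of 10 Å or less from the origin Cα, and the >5 criterion is assumed to count the solvent-accessible residues inside each patch.

```python
import math


def build_patches(residues, accessible, radius=10.0, min_accessible=5):
    """Return the set of distinct surface patches.

    residues:   dict mapping residue id -> {"CA": (x, y, z), "atoms": [(x, y, z), ...]}
    accessible: set of residue ids that passed the solvent-accessibility test
    """
    patches = set()  # a set of frozensets drops identical patches automatically
    for origin in residues.values():
        ca = origin["CA"]
        # Assumption: a residue joins the patch if ANY of its atoms is within the radius.
        members = frozenset(
            res_id
            for res_id, res in residues.items()
            if any(math.dist(ca, atom) <= radius for atom in res["atoms"])
        )
        # Keep only patches with more than `min_accessible` accessible residues.
        if len(members & accessible) > min_accessible:
            patches.add(members)
    return patches


# Toy coordinates: seven residues 1 Å apart on a line, plus one isolated residue.
toy = {i: {"CA": (float(i), 0.0, 0.0), "atoms": [(float(i), 0.0, 0.0)]} for i in range(7)}
toy[99] = {"CA": (100.0, 0.0, 0.0), "atoms": [(100.0, 0.0, 0.0)]}
toy_patches = build_patches(toy, accessible=set(toy))
```

In the toy input all seven neighbouring residues generate the same patch, which is kept once, while the isolated residue yields a patch that is too small to be retained.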
Given an l-residue DNA-binding protein structure, all Asp/Glu residues were deprotonated, while Arg/Lys residues were protonated; His residues were protonated or deprotonated depending on the availability of hydrogen-bond acceptors in the structure. Next, l mutant structures were generated, one per residue position, by replacing Ala, Asn, Asp, Cys, Gly, Ser, Thr, or Val in the wild-type structure with Asp− and any other residue with Glu−. The side-chain replacements were carried out using SCWRL, followed by energy minimization with heavy constraints on all heavy atoms using AMBER to relieve any bad contacts. Based on the wild-type/mutant structures, the gas-phase (ε = 1) electrostatic energy of the wild-type ($E_{\mathrm{elec}}^{\mathrm{wt}}$) or mutant ($E_{\mathrm{elec}}^{\mathrm{mut}}$) protein in the folded state relative to that in an extended reference state ($E_{\mathrm{elec}}^{\prime\,\mathrm{wt}}$ or $E_{\mathrm{elec}}^{\prime\,\mathrm{mut}}$) was computed using AMBER with the all-hydrogen-atom AMBER force field. In this extended reference state, the residues do not interact with one another; hence, the electrostatic energy difference between the wild-type ($E_{\mathrm{elec}}^{\prime\,\mathrm{wt}}$) and mutant ($E_{\mathrm{elec}}^{\prime\,\mathrm{mut}}$) unfolded proteins is equal to the difference between the electrostatic energies of the native residue at position i ($E_{\mathrm{elec}}^{\prime\,i}$) and the corresponding mutant Asp−/Glu− ($E_{\mathrm{elec}}^{\prime\,\mathrm{D/E}}$). The change in the gas-phase electrostatic energy, $\Delta\Delta_{\mathrm{elec}}^{i}$, upon mutation of residue i to Asp−/Glu− is given by:
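$$\Delta\Delta_{\mathrm{elec}}^{i} = \left(E_{\mathrm{elec}}^{\mathrm{mut},i} - E_{\mathrm{elec}}^{\mathrm{wt}}\right) - \left(E_{\mathrm{elec}}^{\prime\,\mathrm{D/E}} - E_{\mathrm{elec}}^{\prime\,i}\right) \tag{1}$$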
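The mutation rule and Eq. (1) can be illustrated with a short Python sketch. The function and argument names are hypothetical, and the four energies are assumed to have been computed beforehand (for example with AMBER, which is not invoked here); the numbers in the example are made up to show the arithmetic. Note that under the stated rule a native Asp maps to Asp− and a native Glu to Glu−.

```python
# Residue types that are replaced by Asp-; every other type is replaced by Glu-.
ASP_TARGETS = frozenset({"ALA", "ASN", "ASP", "CYS", "GLY", "SER", "THR", "VAL"})


def mutant_type(native: str) -> str:
    """Three-letter code of the acidic residue that replaces `native`."""
    return "ASP" if native.upper() in ASP_TARGETS else "GLU"


def delta_delta_elec(e_mut_folded: float, e_wt_folded: float,
                     e_ref_acidic: float, e_ref_native: float) -> float:
    """Eq. (1): folded-state energy change minus reference-state energy change.

    e_mut_folded : gas-phase electrostatic energy of the folded mutant
    e_wt_folded  : gas-phase electrostatic energy of the folded wild type
    e_ref_acidic : energy of the isolated Asp-/Glu- in the extended reference state
    e_ref_native : energy of the isolated native residue in the reference state
    """
    return (e_mut_folded - e_wt_folded) - (e_ref_acidic - e_ref_native)


# Made-up energies (arbitrary units) for a Lys position mutated to Glu-.
target = mutant_type("LYS")
example = delta_delta_elec(-120.0, -100.0, -30.0, -25.0)
```

A negative value indicates that the acidic substitution is electrostatically favorable at that position, which is the signature sought for DNA-binding residues.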
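The average electrostatic energy change $\langle\Delta\Delta_{\mathrm{elec}}\rangle_i$ of the $N_{aa}^{i}$ residues comprising surface patch i was computed from:
$$\langle\Delta\Delta_{\mathrm{elec}}\rangle_i = \frac{1}{N_{aa}^{i}} \sum_{j} \Delta\Delta_{\mathrm{elec}}^{j} \tag{2}$$
where the summation in Eq. (2) is over all residues j in patch i.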
For a given DNA-binding protein, the conservation score $C_i$ of residue i was obtained from the ConSurf-DB database or the ConSurf server. $C_i$ is an integer ranging from 1 (a rapidly evolving, highly variable residue) to 9 (a slowly evolving, conserved residue). The average conservation $\langle C\rangle_i$ of the $N_{aa}^{i}$ residues comprising surface patch i was computed from:
$$\langle C\rangle_i = \frac{1}{N_{aa}^{i}} \sum_{j} C_{j} \tag{3}$$
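Both patch-level quantities, the average electrostatic energy change and the average conservation, are plain arithmetic means over the residues of a patch. A minimal Python sketch, assuming the per-residue values are already available in dictionaries keyed by residue id (the name `patch_average` and the example values are hypothetical):

```python
from statistics import fmean


def patch_average(patch, per_residue_value):
    """Mean of a per-residue quantity over all residues in one patch.

    patch             : iterable of residue ids belonging to the patch
    per_residue_value : dict mapping residue id -> value (energy change or score)
    """
    return fmean(per_residue_value[res_id] for res_id in patch)


# Made-up values for a three-residue patch.
dd_elec = {1: -10.0, 2: -20.0, 3: 0.0}   # per-residue electrostatic energy changes
cons = {1: 9, 2: 7, 3: 8}                # per-residue conservation scores (1 to 9)
patch = [1, 2, 3]

mean_dd_elec = patch_average(patch, dd_elec)
mean_cons = patch_average(patch, cons)
```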
To determine the DNA-binding residues in a given protein, the distinct patches were ranked by their $\langle\Delta\Delta_{\mathrm{elec}}\rangle_i$ values, so that the top-ranked patch had the most favorable (most negative) value and the bottom-ranked patch the least favorable. Among the top 10% of patches in this ranking, the three with the largest $\langle C\rangle_i$ values were selected, and their constituent solvent-accessible residues were predicted to bind DNA.
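The two-stage selection can be sketched in Python as below. Several details are assumptions rather than part of the description: the patch records and the name `predict_binding_residues` are hypothetical, the top 10% is rounded up to a whole number of patches with at least one kept, and ties are broken by sort order.

```python
def predict_binding_residues(patches, top_percent=10, n_conserved=3):
    """Select predicted DNA-binding residues from scored surface patches.

    patches: list of dicts with keys
        "mean_dd_elec" : average electrostatic energy change of the patch
        "mean_cons"    : average conservation score of the patch
        "accessible"   : set of solvent-accessible residue ids in the patch
    """
    # Stage 1: rank by electrostatics, most negative (most favorable) first.
    ranked = sorted(patches, key=lambda p: p["mean_dd_elec"])
    # Assumption: round the top-percent cutoff up, using integer arithmetic.
    n_top = max(1, (len(ranked) * top_percent + 99) // 100)
    top = ranked[:n_top]

    # Stage 2: among those, keep the patches with the highest conservation.
    best = sorted(top, key=lambda p: p["mean_cons"], reverse=True)[:n_conserved]

    # The prediction is the union of accessible residues in the selected patches.
    predicted = set()
    for p in best:
        predicted |= p["accessible"]
    return predicted


# Forty made-up patches; patch i holds the single accessible residue i.
toy = [
    {"mean_dd_elec": -float(i), "mean_cons": float(i % 9 + 1), "accessible": {i}}
    for i in range(40)
]
predicted = predict_binding_residues(toy)
```

With forty patches the electrostatic filter retains four, and the conservation filter then keeps three of those.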