iGEM Patras 2021


The hardest work begins in dry docks.



Advances in informatics and computing power have facilitated the use of virtual screening in the development of new drugs. These advances, together with easy access to databases of protein structures and small molecules, made the emergence of molecular docking all but inevitable. The purpose of molecular docking software is to evaluate and predict molecular recognition and interaction, both structurally, by presenting likely binding regions and modes, and energetically, by calculating the binding affinity of those interactions. Molecular docking is most often used to predict the interaction between a small molecule and a target macromolecule. This mode is called ligand-protein docking; however, its use is not limited to this mode. The growing interest in the interactions between macromolecules has also made protein-protein docking a reality. Structure-activity studies, virtual screening for potential lead compounds and their optimization, binding hypotheses that aid predictions for mutagenesis studies, chemical mechanism studies, and combinatorial library design are just a few of the applications that make molecular docking a valuable tool in drug discovery.

Theoretical Background

Although there are many molecular docking tools and methods, all of them follow a workflow that consists of four basic steps:

  • Target Selection and Preparation
  • Ligand Selection and Preparation
  • Docking
  • Evaluation of Docking Results

This workflow serves to find the most favorable binding mode between the ligand and the target of interest, independently of the methodology. The ligand's binding mode is characterized by its state variables: its spatial position (x, y, z), its orientation and, if the ligand is flexible, its conformation. A search method finds the state variables, while a scoring function is required to assess the precision of the model and to validate the ligand's binding mode.

Scoring functions can be empirical, force field-based, or knowledge-based, whereas search methods can be subcategorized into systematic and stochastic. In the systematic method, which is a deterministic approach, the variables are tested at predefined intervals. In contrast, stochastic search methods make random changes continuously until a pre-decided termination criterion is met. Search methods can also be divided into global and local, depending on how widely the search space is explored. Local search methods look for the energy minimum closest to the existing conformation, whereas global search methods look for the best, or global, energy minimum within the designated search space. Hybrid global–local search strategies have been demonstrated to outperform global approaches alone in terms of efficiency and ability to find lower energies. Among 3D structural modeling programs, AutoDock, which is the one we used for our computations, offers the user a choice of two local search methods (Solis/Wets and Pattern Search), two global search methods (simulated annealing (SA) and the genetic algorithm (GA)), and one hybrid global–local search method, the Lamarckian GA (LGA).
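The difference between a systematic and a stochastic search can be illustrated with a minimal sketch. It assumes a toy energy function of a single torsion angle; the function `toy_energy`, its shape, and all parameter values are hypothetical and are not taken from AutoDock. The stochastic variant uses a simple Metropolis acceptance rule, which is only one of several possible choices.

```python
import math
import random


def toy_energy(angle_deg):
    """Hypothetical energy landscape of one torsion angle (arbitrary units)."""
    a = math.radians(angle_deg)
    return math.cos(3 * a) + 0.5 * math.cos(a - 1.0)


def systematic_search(step_deg=10):
    """Deterministic: test the variable at predefined intervals."""
    angles = range(0, 360, step_deg)
    return min(angles, key=toy_energy)


def stochastic_search(n_steps=2000, max_move_deg=30.0, temperature=0.5, seed=0):
    """Stochastic: make random changes until a pre-decided step count is reached.

    Moves are accepted with a Metropolis criterion (an assumption of this sketch).
    """
    rng = random.Random(seed)
    current = rng.uniform(0, 360)
    best = current
    for _ in range(n_steps):
        trial = (current + rng.uniform(-max_move_deg, max_move_deg)) % 360
        delta = toy_energy(trial) - toy_energy(current)
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current = trial
            if toy_energy(current) < toy_energy(best):
                best = current
    return best


if __name__ == "__main__":
    print("systematic:", systematic_search())
    print("stochastic:", round(stochastic_search(), 1))
```

The systematic search always returns the same grid angle, and its quality depends on `step_deg`; the stochastic search depends on the random seed and on how long it is allowed to run.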

Target Selection and Preparation

Ideally, the structure of the target molecule should be determined experimentally, typically by X-ray crystallography, NMR, or cryo-EM. However, as in our project, docking calculations can also be performed on homology models. In that case, the precision and reliability of the docking results depend almost exclusively on the quality of the homology model. Experimentally determined target structures are obtained from databases, and the higher the resolution of the experimental structure, the higher the precision of the docking. Moreover, some docking methods create a grid that represents possible receptor regions. This accelerates the calculation by limiting the possible areas of interaction between the ligand and the target molecule.
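The idea behind such a grid can be sketched as follows: the interaction energy of a probe atom with the receptor is computed once for every grid point inside a box, and ligand atoms are later scored by table lookup instead of by summing over all receptor atoms again. This is a simplified illustration under stated assumptions: the 12-6 `pair_energy` term, the two-atom "receptor", and the nearest-point `lookup` are hypothetical stand-ins, and real programs typically interpolate between neighbouring grid points.

```python
import math


def pair_energy(r, r_min=3.5, depth=1.0):
    """Toy 12-6 interaction term (arbitrary units); not AutoDock's function."""
    r = max(r, 0.5)  # avoid a division blow-up at very short distances
    ratio = r_min / r
    return depth * (ratio ** 12 - 2 * ratio ** 6)


def build_grid(receptor_atoms, origin, n_points, spacing):
    """Precompute the probe energy once at every grid point inside the box."""
    nx, ny, nz = n_points
    grid = {}
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                p = (origin[0] + i * spacing,
                     origin[1] + j * spacing,
                     origin[2] + k * spacing)
                grid[(i, j, k)] = sum(pair_energy(math.dist(p, a))
                                      for a in receptor_atoms)
    return grid


def lookup(grid, origin, spacing, point):
    """Nearest-grid-point lookup; raises KeyError outside the box."""
    idx = tuple(round((c - o) / spacing) for c, o in zip(point, origin))
    return grid[idx]


# Hypothetical two-atom receptor and a small box above it.
receptor = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
origin, spacing = (0.0, 0.0, 3.0), 0.5
grid = build_grid(receptor, origin, (9, 5, 5), spacing)
energy_at_point = lookup(grid, origin, spacing, (2.0, 1.0, 4.0))
```

Restricting the box to the expected binding region is what limits the possible areas of interaction: any ligand position outside the box simply has no grid value.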

Ligand Selection and Preparation

For lead compound discovery, basic filters such as net charge, molecular weight, polar surface area, solubility, and commercial availability can be used to minimize the number of molecules to be docked. Similarity thresholds, pharmacophores, synthetic accessibility, and absorption, distribution, metabolism, excretion, and toxicity (ADME-Tox) properties are all used in the lead optimization process. Finally, a dedicated library of analogs related to the lead compounds is frequently built for docking to inform and prioritize medicinal chemistry activities for focused lead optimization. For the ligand and receptor, AutoDock employs a united-atom model in which only polar hydrogens are present. It also requires the assignment of partial atomic charges to the ligand. The AutoDock scoring functions were calibrated using Gasteiger partial charges on the ligand. Hence, the ligand must be assigned Gasteiger partial charges in order to use the scoring functions accurately.
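The basic filters mentioned above for lead compound discovery can be sketched in a few lines. The `Compound` record, the compound names, and every cut-off value below are illustrative placeholders and not values used in our project; solubility is left out only to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    net_charge: int
    mol_weight: float          # g/mol
    polar_surface_area: float  # square angstroms
    commercially_available: bool


def passes_basic_filters(c, max_abs_charge=1, max_mol_weight=500.0, max_psa=140.0):
    """Return True if the compound survives every basic filter.

    The default cut-offs are hypothetical and should be chosen per project.
    """
    return (
        abs(c.net_charge) <= max_abs_charge
        and c.mol_weight <= max_mol_weight
        and c.polar_surface_area <= max_psa
        and c.commercially_available
    )


# Hypothetical mini-library.
library = [
    Compound("cmpd_A", 0, 320.4, 75.0, True),
    Compound("cmpd_B", 2, 410.1, 90.0, True),     # fails: net charge
    Compound("cmpd_C", 0, 650.8, 160.0, True),    # fails: weight and PSA
    Compound("cmpd_D", -1, 280.3, 60.0, False),   # fails: not available
]
to_dock = [c.name for c in library if passes_basic_filters(c)]
```

Only the compounds in `to_dock` would then go through the more expensive ligand preparation and docking steps.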

It should be noted that AutoDock has also been used successfully with alternative charge calculation algorithms for ligands. With the exception of ring conformations, most docking tools treat ligands as flexible. In general, the more rotatable bonds a ligand contains, the more complex and time-consuming docking becomes. This is because the size of the search space grows exponentially with the number of torsions. Consequently, the rotation of conjugated bonds, such as amide bonds and those seen in carbamates and ureas, should be kept to a minimum.
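The exponential growth can be made concrete with a small calculation. It assumes, purely for illustration, that every rotatable bond is sampled at a fixed angular step; the function name and the 30 degree step are hypothetical choices.

```python
def torsion_search_space(n_torsions, step_deg=30):
    """Number of torsion combinations if each bond is sampled every step_deg degrees."""
    states_per_torsion = 360 // step_deg
    return states_per_torsion ** n_torsions


if __name__ == "__main__":
    for n in (2, 5, 10):
        print(n, "torsions:", torsion_search_space(n), "combinations")
```

With a 30 degree step there are 12 states per torsion, so going from 5 to 10 rotatable bonds raises the count from 12^5 to 12^10, roughly 6 × 10^10 combinations, before position and orientation are even considered. Freezing a conjugated bond removes one factor of 12 from this product.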


Docking

As mentioned earlier, molecular docking methods explore the search space computationally, each in its own way, and determine the best binding mode from its properties. These search methods are categorized as systematic or stochastic.

SynBio combines the scientific fields of Biology, Biotechnology, Systems biology, Engineering, Chemistry, Mathematics, and Bioinformatics. Bioinformatics, the field concerned with methods for understanding biological data, is the one we dealt with throughout our project. The next-generation sequencing in Project PGasus is an informative example of what Bioinformatics is capable of. Next-generation sequencing (NGS) is a massively parallel sequencing technology that offers ultra-high-throughput sequencing and time-efficient data analysis. Nanopore sequencing is a unique, scalable technology that enables direct, real-time analysis of long DNA or RNA fragments. It works by monitoring changes in an electrical current as nucleic acids pass through a protein nanopore. The resulting signal is decoded to provide the specific DNA or RNA sequence. So, this year, common, rare, and novel PGx variants were detected with the help of Oxford Nanopore's NGS, and the principles of Bioinformatics, one of the seven parts of synthetic biology, were applied to their functional characterization.


Systematic search methods:

  • Deterministic outcome
  • Quality depends on the level of detail with which the search space is sampled
  • Used in rigid protein-rigid protein docking

Stochastic search methods:

  • Based on randomness
  • Suitable for high-dimensional calculations (flexible ligand-protein docking)

The AutoDock scoring function is based on the AMBER molecular mechanics force field, with two additional terms: one that models the change in desolvation free energy on binding, based on atomic solvation parameters, and one empirical term that models the loss of conformational entropy on binding.

The individual contributions to the total energy of binding, such as the possible interactions between ligand and macromolecule and the ligand's flexibility, are treated as independent variables.
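As a rough guide, the additive form described above is commonly written in the AutoDock 4 literature along the following lines. The notation is introduced here for illustration only: the W terms are empirically calibrated weights, r_ij is the distance between ligand atom i and receptor atom j, E(t) is a directional weight for hydrogen bonds, S and V are atomic solvation parameters and volumes, and N_tors is the number of rotatable bonds. It should be read as a simplified sketch rather than the exact expression evaluated by the program.

```latex
\Delta G_{\text{bind}} \approx
    W_{\text{vdw}}   \sum_{i,j} \left( \frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}} \right)
  + W_{\text{hbond}} \sum_{i,j} E(t) \left( \frac{C_{ij}}{r_{ij}^{12}} - \frac{D_{ij}}{r_{ij}^{10}} \right)
  + W_{\text{elec}}  \sum_{i,j} \frac{q_i q_j}{\varepsilon(r_{ij})\, r_{ij}}
  + W_{\text{sol}}   \sum_{i,j} \left( S_i V_j + S_j V_i \right) e^{-r_{ij}^{2} / 2\sigma^{2}}
  + W_{\text{conf}}\, N_{\text{tors}}
```

The first three sums correspond to the force field part (dispersion/repulsion, hydrogen bonding, electrostatics), the fourth to desolvation, and the last to the loss of conformational entropy, which grows with the ligand's flexibility.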

Evaluation of docking results

Regardless of the ligand-protein docking tool used, docking results should be evaluated by considering the chemical complementarity between ligand and protein. The parameters chosen for the docking can be judged by the docking tool's ability to reproduce the binding mode of a ligand to a protein when the structure of the ligand-protein complex is known. The criterion usually used is the all-atom RMSD between the docked position and the crystallographically observed binding position of the ligand, and success is typically regarded as an RMSD of less than 2 Å. When docking using stochastic methods, it is recommended that the experiment be run at least 50 times with different initial conditions. The similarity of the predicted binding modes can be assessed by computing a matrix of pairwise RMSD values and clustering docked conformations according to an RMSD threshold, typically 2 Å. If all of the dockings cluster into one family, this indicates that the search parameters were sufficient for each docking to converge. If there is no clustering at all, then the dockings should be repeated with increased sampling: either increasing the number of iterations per search, increasing the number of searches, or, if the method is population-based, increasing the population size. If the scoring function were perfect, the docked conformation with the lowest energy would always correspond to the crystallographically observed binding mode, assuming no bad contacts in the crystal structure. This is not always the case, and sometimes a different binding mode is observed significantly more often than the lowest-energy binding mode.
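The RMSD criterion and the clustering step can be sketched as follows. The sketch assumes that all poses list the same atoms in the same order and share the receptor's coordinate frame, so no superposition is applied. The greedy scheme used here (poses visited from lowest to highest energy, each joining the first cluster whose lowest-energy member lies within the threshold) is one simple option and an assumption of this example; the coordinates and energies are made up.

```python
import math


def rmsd(pose_a, pose_b):
    """All-atom RMSD between two poses with identical atom ordering."""
    if len(pose_a) != len(pose_b):
        raise ValueError("poses must contain the same number of atoms")
    squared = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                  for (xa, ya, za), (xb, yb, zb) in zip(pose_a, pose_b))
    return math.sqrt(squared / len(pose_a))


def cluster_poses(poses, energies, threshold=2.0):
    """Greedy clustering of docked poses by RMSD (threshold in angstroms).

    Returns lists of pose indices; the first index of each list is the
    lowest-energy member of that cluster.
    """
    order = sorted(range(len(poses)), key=lambda i: energies[i])
    clusters = []
    for i in order:
        for cluster in clusters:
            if rmsd(poses[i], poses[cluster[0]]) <= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Hypothetical two-atom ligand docked three times.
poses = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.5, 0.0, 0.0), (1.5, 0.0, 0.0)],   # 0.5 angstrom from the first pose
    [(0.0, 5.0, 0.0), (1.0, 5.0, 0.0)],   # 5 angstroms from the first pose
]
energies = [-7.0, -6.5, -5.0]
clusters = cluster_poses(poses, energies)
```

A single large cluster suggests the searches converged; many singleton clusters suggest that more sampling is needed.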

There has been growing interest in developing methods to distinguish binders from nonbinders. One of the earliest reports that used Docking to successfully discriminate binders from nonbinders considered a simple metric that combined the mean binding energy for all of the conformations in the cluster and the total number of conformationally distinct clusters found out of 100 dockings. The more clusters and the weaker the mean energy, the less likely the ligand was to bind. Furthermore, current docking methods tend to find the binding mode with the lowest possible interaction energy for a given ligand. This score does not necessarily indicate whether the ligand even binds. By building on statistical mechanical foundations, new methods are emerging that estimate the contributions of translational and rotational entropy to binding affinity by approximating the configurational entropy using the sizes of the clusters.

Overall, molecular docking has become an indispensable part of the drug design phase. The method is algorithmic: it is generally employed to explore the best-fitting pose and its score, to assess the compactness of molecules from an ensemble, and to sort them by assigning ranks. It is used within virtual screening, which draws on molecular descriptors and physicochemical properties of active and inactive ligands, and it is very useful for finding hits and leads through library enrichment for screening. Notably, when molecular docking is used as the final stage in virtual screening, it helps to provide a three-dimensional (3D) structural hypothesis of how a ligand interacts with its target (DNA, RNA, or protein). The prediction of novel ligands for more than 50 targets using molecular docking has also been reported over the last decade.
