Team:TAU Israel/Contribution

Future Contribution

Software Contribution

Our software aims to tailor the expression of genetic information to any microbial population. It produces a regulated gene ready for transformation, along with relevant documentation and predictions, presented in a user-friendly manner. To match the variability of metagenomic data, the software applies to communities at various levels of characterization, from annotated genomes alone to documented gene expression levels, and offers optional advanced features that are fully customizable.

The flexibility of our optimizations and the ability to provide a working framework for non-model organisms will allow future iGEM teams to apply this tool in many different settings.

Motif Algorithms Table

For the promoter module in our integrative model, we conducted an extensive review of the scientific literature on algorithms for motif discovery and comparison.

We screened those algorithms (most of them reviewed in Das and Dai [1]) against a set of requirements we defined, in order to find the method most suitable for our needs, and organized the results in a table.

The table was helpful to us (SPOILER: we chose STREME [11]), and we'd like to share it here so that future iGEM teams, with their own criteria and goals, can save time and effort and have easy access to anything motif-related.

Click here to download the spreadsheet.
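
For teams who want to apply their own criteria to a comparison table like this one, here is a minimal sketch of the screening step in Python. The column names, the algorithm names ("Algorithm A" and so on) and all the values are hypothetical placeholders, not the contents of our spreadsheet. It only illustrates the idea of keeping the rows that satisfy every requirement.

```python
import csv
import io

# Hypothetical excerpt of a comparison table. The column names and values
# are placeholders for illustration, not the real spreadsheet contents.
TABLE_CSV = """name,input_type,finds_multiple_motifs,maintained
Algorithm A,DNA,yes,no
Algorithm B,DNA,yes,yes
Algorithm C,protein,no,yes
"""


def shortlist(table_csv, requirements):
    """Return the names of rows whose columns match every required value."""
    rows = csv.DictReader(io.StringIO(table_csv))
    return [
        row["name"]
        for row in rows
        if all(row[col] == wanted for col, wanted in requirements.items())
    ]


# A team's own (hypothetical) set of requirements.
requirements = {
    "input_type": "DNA",
    "finds_multiple_motifs": "yes",
    "maintained": "yes",
}

print(shortlist(TABLE_CSV, requirements))  # prints ['Algorithm B']
```

To use it with a real table, export the spreadsheet to CSV and adjust the `requirements` dictionary to the actual column headers.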


  1. M. K. Das and H.-K. Dai, “A survey of DNA motif finding algorithms,” BMC Bioinformatics, vol. 8, no. S7, 2007.
  2. J. van Helden, B. André, and J. Collado-Vides, “Extracting regulatory sites from the upstream region of yeast genes by computational analysis of oligonucleotide frequencies,” Journal of Molecular Biology, vol. 281, no. 5, pp. 827–842, 1998.
  3. M. Tompa, “An Exact Method for Finding Short Motifs in Sequences, with Application to the Ribosome Binding Site Problem,” Proceedings of the Seventh International Conference on Intelligent Systems for Molecular Biology, pp. 262–271, 1999.
  4. S. Sinha and M. Tompa, “A statistical method for finding transcription factor binding sites,” Proceedings. International Conference on Intelligent Systems for Molecular Biology, vol. 8, pp. 344–354, Feb. 2000.
  5. M.-F. Sagot, “Spelling approximate repeated or common motifs using a suffix tree,” LATIN'98: Theoretical Informatics, pp. 374–390, 1998.
  6. A. Vanet, L. Marsan, A. Labigne, and M.-F. Sagot, “Inferring regulatory elements from a whole genome. An analysis of Helicobacter pylori σ80 family of promoter signals,” Journal of Molecular Biology, vol. 297, no. 2, pp. 335–353, 2000.
  7. G. Z. Hertz, G. W. Hartzell, and G. D. Stormo, “Identification of consensus patterns in unaligned DNA sequences known to be functionally related,” Bioinformatics, vol. 6, no. 2, pp. 81–92, 1990.
  8. T. A. Down and T. J. P. Hubbard, “NestedMICA: Sensitive inference of over-represented motifs in nucleic acid sequence,” Nucleic Acids Research, vol. 33, no. 5, pp. 1445–1453, 2005.
  9. C. E. Lawrence and A. A. Reilly, “An expectation maximization (EM) algorithm for the identification and characterization of common sites in unaligned biopolymer sequences,” Proteins: Structure, Function, and Genetics, vol. 7, no. 1, pp. 41–51, 1990.
  10. T. L. Bailey and C. Elkan, “Unsupervised learning of multiple motifs in biopolymers using expectation maximization,” Machine Learning, vol. 21, no. 1-2, pp. 51–80, 1995.
  11. T. L. Bailey, “STREME: Accurate and versatile sequence motif discovery,” Bioinformatics, vol. 37, no. 18, pp. 2834–2840, 2021.
  12. C. Lawrence, S. Altschul, M. Boguski, J. Liu, A. Neuwald, and J. Wootton, “Detecting subtle sequence signals: A Gibbs sampling strategy for multiple alignment,” Science, vol. 262, no. 5131, pp. 208–214, 1993.
  13. F. P. Roth, J. D. Hughes, P. W. Estep, and G. M. Church, “Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole-genome mRNA quantitation,” Nature Biotechnology, vol. 16, no. 10, pp. 939–945, 1998.
  14. G. Thijs, K. Marchal, M. Lescot, S. Rombauts, B. De Moor, P. Rouzé, and Y. Moreau, “A Gibbs sampling method to detect overrepresented motifs in the upstream regions of coexpressed genes,” Journal of Computational Biology, vol. 9, no. 2, pp. 447–464, 2002.
  15. X. Liu, D. L. Brutlag, and J. S. Liu, “BioProspector: Discovering conserved DNA motifs in upstream regulatory regions of co-expressed genes,” Biocomputing 2001, 2000.
  16. K. Shida, “GibbsST: A Gibbs sampling method for motif discovery with enhanced resistance to local optima,” BMC Bioinformatics, vol. 7, no. 1, 2006.
  17. F. F. M. Liu, J. J. P. Tsai, R. M. Chen, S. N. Chen, and S. H. Shih, “FMGA: Finding motifs by genetic algorithm,” Proceedings. Fourth IEEE Symposium on Bioinformatics and Bioengineering, Jan. 2004.
  18. D. Liu, X. Xiong, B. Dasgupta, and H. Zhang, “Motif discoveries in unaligned molecular sequences using self-organizing neural networks,” IEEE Transactions on Neural Networks, vol. 17, no. 4, pp. 919–928, 2006.
  19. C. Kingsford, E. Zaslavsky, and M. Singh, “A compact mathematical programming formulation for DNA motif finding,” Combinatorial Pattern Matching, pp. 233–245, 2006.
  20. T. Kaplan, N. Friedman, and H. Margalit, “Ab initio prediction of transcription factor targets using structural knowledge,” PLoS Computational Biology, vol. 1, no. 1, 2005.
  21. J. Hu, B. Li, and D. Kihara, “Limitations and potentials of current motif discovery algorithms,” Nucleic Acids Research, vol. 33, no. 15, pp. 4899–4913, 2005.
  22. J. Hu, Y. D. Yang, and D. Kihara, “EMD: An ensemble algorithm for discovering regulatory motifs in DNA sequences,” BMC Bioinformatics, vol. 7, no. 1, 2006.
  23. X. S. Liu, D. L. Brutlag, and J. S. Liu, “An algorithm for finding protein–DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments,” Nature Biotechnology, vol. 20, no. 8, pp. 835–839, 2002.
  24. M. Blanchette and M. Tompa, “Discovery of regulatory elements by a computational method for phylogenetic footprinting,” Genome Research, vol. 12, no. 5, pp. 739–748, 2002.
  25. E. Berezikov, V. Guryev, R. H. A. Plasterk, and E. Cuppen, “CONREAL: Conserved regulatory elements anchored alignment algorithm for identification of transcription factor binding sites by phylogenetic footprinting,” Genome Research, vol. 14, no. 1, pp. 170–178, 2003.
  26. T. Wang and G. D. Stormo, “Identifying the conserved network of cis-regulatory sites of a eukaryotic genome,” Proceedings of the National Academy of Sciences, vol. 102, no. 48, pp. 17400–17405, 2005.
  27. C. S. Carmack, L. A. McCue, L. A. Newberg, and C. E. Lawrence, “PhyloScan: Identification of transcription factor binding sites using cross-species evidence,” Algorithms for Molecular Biology, vol. 2, no. 1, 2007.
  28. A. Prakash, M. Blanchette, S. Sinha, and M. Tompa, “Motif discovery in heterogeneous sequence data,” Biocomputing 2004, 2003.
  29. T. Wang and G. D. Stormo, “Combining phylogenetic data with co-regulated genes to identify regulatory motifs,” Bioinformatics, vol. 19, no. 18, pp. 2369–2380, 2003.
  30. S. Sinha, M. Blanchette, and M. Tompa, “PhyME: A probabilistic algorithm for finding motifs in sets of orthologous sequences,” BMC Bioinformatics, vol. 5, no. 1, p. 170, Oct. 2004.
  31. A. M. Moses, D. Y. Chiang, and M. B. Eisen, “Phylogenetic motif detection by expectation-maximization on evolutionary mixtures,” Biocomputing 2004, 2004.
  32. R. Siddharthan, E. D. Siggia, and E. van Nimwegen, “PhyloGibbs: A Gibbs sampling motif finder that incorporates phylogeny,” PLoS Computational Biology, vol. 1, no. 7, 2005.
  33. P. Arnold, I. Erb, M. Pachkov, N. Molina, and E. van Nimwegen, “MotEvo: Integrated Bayesian probabilistic methods for inferring regulatory sites and motifs on multiple alignments of DNA sequences,” Bioinformatics, vol. 28, no. 4, pp. 487–494, 2011.
  34. S. Heinz, C. Benner, N. Spann, E. Bertolino, Y. C. Lin, P. Laslo, J. X. Cheng, C. Murre, H. Singh, and C. K. Glass, “Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for macrophage and B cell identities,” Molecular Cell, vol. 38, no. 4, pp. 576–589, 2010.
  35. V. Solovyev and A. Salamov, “Automatic Annotation of Microbial Genomes and Metagenomic Sequences,” Metagenomics and its Applications in Agriculture, Biomedicine and Environmental Studies, pp. 61–78, Jan. 2011.