GenomeThreader Gene Prediction Software

GenomeThreader is a software tool for computing gene structure predictions. The predictions are calculated using a similarity-based approach in which additional cDNA/EST and/or protein sequences are used to predict gene structures via spliced alignments. GenomeThreader was motivated by disabling limitations in GeneSeqer, a popular gene prediction program that is widely used for plant genome annotation.
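To make the term "spliced alignment" concrete, here is a minimal Python sketch. It is not GenomeThreader's algorithm: it only accepts exact matches and places exons greedily, whereas real spliced aligners score mismatches, gaps, and splice sites. The function names `spliced_exact_align` and `introns`, the `min_exon` parameter, and the toy sequences are all hypothetical and chosen for illustration only.

```python
def spliced_exact_align(genomic, cdna, min_exon=4):
    """Greedily place cdna on genomic as a chain of exact-match exons.

    Returns a list of (genomic_start, genomic_end, cdna_start, cdna_end)
    half-open intervals, or None if the cDNA cannot be placed completely.
    Toy illustration only (assumption: no mismatches, no scoring).
    """
    exons = []
    g_pos = 0  # the next exon must start at or after this genomic position
    c_pos = 0  # the next cDNA base that still has to be placed
    while c_pos < len(cdna):
        hit = None
        # Try the longest remaining cDNA prefix first, then shorter ones.
        for length in range(len(cdna) - c_pos, min_exon - 1, -1):
            start = genomic.find(cdna[c_pos:c_pos + length], g_pos)
            if start != -1:
                hit = (start, length)
                break
        if hit is None:
            return None
        start, length = hit
        exons.append((start, start + length, c_pos, c_pos + length))
        g_pos = start + length
        c_pos += length
    return exons


def introns(exons):
    """Genomic gaps between consecutive exons, as half-open intervals."""
    return [(a[1], b[0]) for a, b in zip(exons, exons[1:])]


# Toy genomic sequence: two exons separated by an intron (GT...AG).
genomic = "AAATGGCC" + "GTAAGTTTTTAG" + "TTGACC" + "GG"
cdna = "ATGGCC" + "TTGACC"

exons = spliced_exact_align(genomic, cdna)
print(exons)           # [(2, 8, 0, 6), (20, 26, 6, 12)]
print(introns(exons))  # [(8, 20)]
```

The exon coordinates give the predicted gene structure: the cDNA is explained by two genomic segments, and the gap between them is the inferred intron.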


References have been omitted for brevity; you can find them and more details on the implementation in the GenomeThreader paper. How to take advantage of these features and many more is described in depth in the GenomeThreader manual. Please consult the FAQ page for frequently asked questions. All mentioned files and scripts are also part of the GenomeThreader distribution (see below).


GenomeThreader is available free of charge. You can download a copy.



The following sites use GenomeThreader. This list is not intended to be comprehensive.


Here are the most important publications citing GenomeThreader (sorted by journal):

  1. Wang et al. The genome sequence of African rice (Oryza glaberrima) and evidence for independent domestication, Nature Genetics 46:982-988, 2014.
  2. Argout et al. The genome of Theobroma cacao, Nature Genetics 43:101-108, 2011.
  3. The Tomato Genome Consortium. The tomato genome sequence provides insights into fleshy fruit evolution, Nature 485:635-641, 2012.
  4. The International Barley Genome Sequencing Consortium. A physical, genetic and functional sequence assembly of the barley genome, Nature 491:711-716, 2012.
  5. J.M. Cock et al. The Ectocarpus genome and the independent evolution of multicellularity in brown algae, Nature 465:617-621, 2010.
  6. The International Brachypodium Initiative. Genome sequencing and analysis of the model grass Brachypodium distachyon, Nature 463:763-768, 2010.
  7. R. Wang et al. PEP1 regulates perennial flowering in Arabis alpina, Nature 459:423-427, 2009.
  8. A.H. Paterson et al. The Sorghum bicolor genome and the diversification of grasses, Nature 457:551-556, 2009.
  9. P. Abad et al. Genome sequence of the metazoan plant-parasitic nematode Meloidogyne incognita, Nature Biotechnology 26:909-915, 2008.
  10. Wang et al. The Spirodela polyrhiza genome reveals insights into its neotenous reduction fast growth and aquatic lifestyle, Nature Communications 5 Article number: 3311, 2014.
  11. The International Wheat Genome Sequencing Consortium (IWGSC). A chromosome-based draft sequence of the hexaploid bread wheat (Triticum aestivum) genome, Science 345(6194), 2014.
  12. Pfeifer et al. Genome interplay in the grain transcriptome of hexaploid bread wheat, Science 345(6194), 2014.
  13. R. Bruggmann et al. Uneven chromosome contraction and expansion in the maize genome, Genome Research 16:1241-1251, 2006.
  14. Moreau et al. Gene functionalities and genome structure in Bathycoccus prasinos reflect cellular specializations at the base of the green lineage, Genome Biology 13(8):R74, 2012.
  15. Duvick et al. PlantGDB: a resource for comparative plant genomics, Nucl. Acids Res. 36:D959-D965, 2008.
  16. Nijkamp et al. Exploring variation-aware contig graphs for (comparative) metagenomics using MaryGold, Bioinformatics 29(22):2826-2834, 2013.
  17. Montalent et al. EuGène-maize: a web site for maize gene prediction, Bioinformatics 26(9):1254-1255, 2010.
  18. Wang et al. Identification and Dissection of Four Major QTL Affecting Milk Fat Content in the German Holstein-Friesian Population, PLOS ONE 7(7):e40711, 2012.
  19. Petre et al. RNA-Seq of Early-Infected Poplar Leaves by the Rust Pathogen Melampsora larici-populina Uncovers PtSultr3;5, a Fungal-Induced Host Sulfate Transporter, PLOS ONE 7(8):e44408, 2012.
  20. Grenville-Briggs et al. A Molecular Insight into Algal-Oomycete Warfare: cDNA Analysis of Ectocarpus siliculosus Infected with the Basal Oomycete Eurychasma dicksonii, PLOS ONE 6(9):e24500, 2011.
  21. Di Filippo et al. Euchromatic and heterochromatic compositional properties emerging from the analysis of Solanum lycopersicum BAC sequences, Gene 499(1):176-181, 2012.
  22. Pausch et al. Genome-Wide Association Study Identifies Two Major Loci Affecting Calving Ease and Growth-Related Traits in Cattle, Genetics 187(1):289-297, 2011.
  23. Martin et al. A uniquely high number of ftsZ genes in the moss Physcomitrella patens, Plant Biology 11(5):744-750, 2009.
  24. Richardt et al. Microarray analysis of the moss Physcomitrella patens reveals evolutionarily conserved transcriptional regulation of salt stress and abscisic acid signalling, Plant Molecular Biology 72(1):27-45, 2010.
  25. De Palma et al. Suppression Subtractive Hybridization analysis provides new insights into the tomato (Solanum lycopersicum L.) response to the plant probiotic microorganism Trichoderma longibrachiatum MK1, Journal of Plant Physiology 190:79-94, 2016.
  26. van der Burgt et al. Pseudogenization in pathogenic fungi with different host plants and lifestyles might reflect their evolutionary past, Molecular Plant Pathology 15(2):133-144, 2014.
  27. M. Calviño, R. Bruggmann and J. Messing. Screen of genes linked to high-sugar content in stems by comparative genomics, Rice 1(2):166-176, 2008.
  28. Lin et al. Structural and Functional Divergence of a 1-Mb Duplicated Region in the Soybean (Glycine max) Genome and Comparison to an Orthologous Region from Phaseolus vulgaris, The Plant Cell 22(8):2545-2561, 2010.
  29. Lelandais-Briere et al. Genome-Wide Medicago truncatula Small RNA Analysis Revealed Novel MicroRNAs and Isoforms Differentially Regulated in Roots and Nodules, The Plant Cell 21(9):2780-2796, 2009.
  30. Van de Velde et al. Inference of Transcriptional Networks in Arabidopsis through Conserved Noncoding Sequence Analysis, The Plant Cell 26(7):2729-2745, 2014.
  31. Schallau et al. Identification and genetic analysis of the APOSPORY locus in Hypericum perforatum L, The Plant Journal 62(5):773-784, 2010.
  32. Tang et al. Unleashing the Genome of Brassica Rapa, Front Plant Sci. 3:172, 2012.
  33. Castagnone-Sereno et al. Data-mining of the Meloidogyne incognita degradome and comparative analysis of proteases in nematodes, Genomics 97(1):29-36, 2011.
  34. Pausch et al. Homozygous haplotype deficiency reveals deleterious mutations compromising reproductive and rearing success in cattle, BMC Genomics 16:312, 2015.
  35. Jung et al. A nonsense mutation in PLD4 is associated with a zinc deficiency-like syndrome in Fleckvieh cattle, BMC Genomics 15:632, 2014.
  36. Ercolano et al. Patchwork sequencing of tomato San Marzano and Vesuviano varieties highlights genome-wide variations, BMC Genomics 15:138, 2014.
  37. Venhoranta et al. In frame exon skipping in UBE3B is associated with developmental disorders and increased mortality in cattle, BMC Genomics 15:1, 2014.
  38. Zimmer et al. Reannotation and extended community resources for the genome of the non-seed plant Physcomitrella patens provide insights into the evolution of plant gene structures and functions, BMC Genomics 14:498, 2013.
  39. Jansen et al. Assessment of the genomic variation in a cattle population by re-sequencing of key animals at low to medium coverage, BMC Genomics 14:446, 2013.
  40. Schiffer et al. The genome of Romanomermis culicivorax: revealing fundamental changes in the core developmental genetic toolkit in Nematoda, BMC Genomics 14:923, 2013.
  41. Duo et al. Mitochondrial genome evolution in species belonging to the Phialocephala fortinii s.l. - Acephala applanata species complex, BMC Genomics 13:166, 2012.
  42. Steuernagel et al. De novo 454 sequencing of barcoded BAC pools for comprehensive gene survey and genome analysis in the complex genome of barley, BMC Genomics 10:547, 2009.
  43. Mondego et al. A genome survey of Moniliophthora perniciosa gives new insights into Witches' Broom Disease of cacao, BMC Genomics 9:548, 2008.
  44. A. Ballvora et al. Comparative sequence analysis of Solanum and Arabidopsis in a hot spot for pathogen resistance on potato chromosome V reveals a patchwork of conserved and rapidly evolving genome segments, BMC Genomics 8:112, 2007.
  45. Iorizzo et al. A DArT marker-based linkage map for wild potato Solanum bulbocastanum facilitates structural comparisons between Solanum A and B genomes, BMC Genetics 15:123, 2014.
  46. Licciardello et al. Characterization of the glutathione S-transferase gene family through ESTs and expression analyses within common and pigmented cultivars of Citrus sinensis (L.) Osbeck, BMC Plant Biology 14:39, 2014.
  47. Sinha et al. Identification and characterization of NAGNAG alternative splicing in the moss Physcomitrella patens, BMC Plant Biology 10:76, 2010.
  48. Bazzini et al. miSolRNA: A tomato micro RNA relational database, BMC Plant Biology 10:240, 2010.
  49. D'Agostino et al. SolEST database: a "one-stop shop" approach to the study of Solanaceae transcriptomes, BMC Plant Biology 9:142, 2009.
  50. M.E. Sparks and V. Brendel. MetWAMer: eukaryotic translation initiation site prediction, BMC Bioinformatics 9:381, 2008.
  51. Chiusano et al. ISOL@: an Italian SOLAnaceae genomics resource, BMC Bioinformatics 9(2):57, 2008.
  52. Q. Dong, M.D. Wilkerson and V. Brendel. Tracembler - software for in-silico chromosome walking in unassembled genomes, BMC Bioinformatics 8:151, 2007.
  53. Flisikowski et al. Variation in neighbouring genes of the dopaminergic and serotonergic systems affects feather pecking behaviour of laying hens, Animal Genetics 40(2):192-199, 2009.
  54. Juling et al. Characterization of a 320-kb region containing the HEXA gene on bovine chromosome 10 and analysis of its association with BSE susceptibility, Animal Genetics 39(4):400-406, 2008.
  55. Foissac et al. Genome Annotation in Plants and Fungi: EuGène as a Model Platform, Current Bioinformatics 3(2), 2008.
  56. Sen et al. MaizeGDB becomes 'sequence-centric', Database--the journal of biological databases and curation 2009 bap020, 2009.
  57. Nijkamp et al. De novo sequencing, assembly and analysis of the genome of the laboratory strain Saccharomyces cerevisiae CEN.PK113-7D, a model for modern industrial biotechnology, Microbial Cell Factories 11:36, 2012.
  58. Asp et al. Comparative sequence analysis of VRN1 alleles of Lolium perenne with the co-linear regions in barley, wheat, and rice, Molecular Genetics and Genomics 286(5):433-447, 2011.
  59. Cohen et al. RAPPORT: running scientific high-performance computing applications on the cloud, Philos Trans A Math Phys Eng Sci. 371:20120073, 2013.
  60. Traini et al. Genome Microscale Heterogeneity among Wild Potatoes Revealed by Diversity Arrays Technology Marker Sequences, International Journal of Genomics Article ID 257218, 2013.

If I missed a publication which cites GenomeThreader, please contact me.


GenomeThreader is being actively developed by the following individuals:


Please cite the following article in publications about research using GenomeThreader:

For in-depth information about GenomeThreader please refer to the following dissertation: