Entamoeba histolytica (JCVI-ESG2-1.0)

Entamoeba histolytica Assembly and Gene Annotation

About the Entamoeba histolytica genome

Entamoeba histolytica is an anaerobic parasitic protozoan of the genus Entamoeba. It predominantly infects humans and other primates and is estimated to infect about 50 million people worldwide. E. histolytica strain HM-1:IMSS was the first human amoeba to have its genome sequenced, assembled and analyzed. The genome was sequenced to 12.5-fold coverage by whole-genome shotgun sequencing with small-insert libraries.


The assembly is imported from AmoebaDB. It is also available from the EMBL/GenBank/DDBJ databases under the accession GCA_000208925.2.


The data on this site are a direct import from AmoebaDB. This release uses AmoebaDB release 1.7 from 11-Jan-2012, which can be downloaded from AmoebaDB. Publications using these data should cite the following publication:

EuPathDB: a portal to eukaryotic pathogen databases. Nucleic Acids Res. 2010 Jan;38(Database issue):D415-9. Epub 2009 Nov 13


  1. New assembly, reannotation and analysis of the Entamoeba histolytica genome reveal new genomic features and protein content information.
    Lorenzi HA, Puiu D, Miller JR, Brinkac LM, Amedeo P, Hall N, Caler EV. 2010. PLoS Negl Trop Dis. 4:e716.
  2. The genome of the protist parasite Entamoeba histolytica.
    Loftus B, Anderson I, Davies R, Alsmark UC, Samuelson J, Amedeo P, Roncaglia P, Berriman M, Hirt RP, Mann BJ et al. 2005. Nature. 433:865-868.
  3. EuPathDB: a portal to eukaryotic pathogen databases.
    Aurrecoechea C, Brestelli J, Brunk BP, Fischer S, Gajria B, Gao X, Gingle A, Grant G, Harb OS, Heiges M et al. 2010. Nucleic Acids Res. 38:D415-9.

Picture credit: Dr. George Healy

More information

General information about this species can be found in Wikipedia.



Assembly: JCVI-ESG2-1.0, INSDC Assembly GCA_000208925.2, Jun 2010
Database version: 111.1
Golden Path Length: 20,835,395
Genebuild by: AmoebaDB
Genebuild method: Import
Data source: AmoebaDB

Gene counts

Coding genes: 8,283
Non coding genes: 97
Small non coding genes: 90
Long non coding genes: 7
Gene transcripts: 8,400