Entamoeba histolytica Assembly and Gene Annotation
About the Entamoeba histolytica genome
Entamoeba histolytica is an anaerobic parasitic protozoan of the genus Entamoeba. It predominantly infects humans and other primates and is estimated to infect about 50 million people worldwide. E. histolytica strain HM-1:IMSS was the first human amoeba to have its genome sequenced, assembled and analyzed. The genome was sequenced to 12.5-fold coverage by whole-genome shotgun sequencing with small-insert libraries.
Assembly
The assembly is imported from AmoebaDB. It is also available from the EMBL/GenBank/DDBJ databases under the accession GCA_000208925.2.
Annotation
The data on this site are a direct import from AmoebaDB. The data used in this release correspond to AmoebaDB release 1.7 from 11-Jan-2012 and can be downloaded from here. Publications using these data should cite the following publication:
References
- New assembly, reannotation and analysis of the Entamoeba histolytica genome reveal new genomic features and protein content information.
Lorenzi HA, Puiu D, Miller JR, Brinkac LM, Amedeo P, Hall N, Caler EV. 2010. PLoS Negl Trop Dis. 4:e716.
- The genome of the protist parasite Entamoeba histolytica.
Loftus B, Anderson I, Davies R, Alsmark UC, Samuelson J, Amedeo P, Roncaglia P, Berriman M, Hirt RP, Mann BJ et al. 2005. Nature. 433:865-868.
- EuPathDB: a portal to eukaryotic pathogen databases.
Aurrecoechea C, Brestelli J, Brunk BP, Fischer S, Gajria B, Gao X, Gingle A, Grant G, Harb OS, Heiges M et al. 2010. Nucleic Acids Res. 38:D415-9.
Picture credit: Dr. George Healy
More information
General information about this species can be found in Wikipedia.
Statistics
Summary
Assembly | JCVI-ESG2-1.0, INSDC Assembly GCA_000208925.2, Jun 2010 |
Database version | 113.1 |
Golden Path Length | 20,835,395 |
Genebuild by | AmoebaDB |
Genebuild method | Import |
Data source | AmoebaDB |
Gene counts
Coding genes | 8,283 |
Non coding genes | 97 |
Small non coding genes | 90 |
Long non coding genes | 7 |
Pseudogenes | 20 |
Gene transcripts | 8,400 |
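
The figures in the tables above can be cross-checked with a little arithmetic. The sketch below is illustrative only: the dictionary keys and function names are hypothetical and not part of any AmoebaDB or Ensembl API, and the numbers are copied by hand from the Summary and Gene counts tables. It checks that the small and long non coding gene counts add up to the non coding total, compares the sum of all gene categories with the transcript count, and derives an approximate coding gene density.

```python
# Values copied by hand from the Summary and Gene counts tables.
# Key names are hypothetical labels chosen for this sketch.
stats = {
    "golden_path_length": 20_835_395,  # base pairs
    "coding_genes": 8_283,
    "non_coding_genes": 97,
    "small_non_coding_genes": 90,
    "long_non_coding_genes": 7,
    "pseudogenes": 20,
    "gene_transcripts": 8_400,
}


def non_coding_breakdown_is_consistent(s):
    """True if small + long non coding genes equals the non coding total."""
    return (
        s["small_non_coding_genes"] + s["long_non_coding_genes"]
        == s["non_coding_genes"]
    )


def total_genes(s):
    """Sum of coding genes, non coding genes and pseudogenes."""
    return s["coding_genes"] + s["non_coding_genes"] + s["pseudogenes"]


def coding_genes_per_megabase(s):
    """Coding genes per million base pairs of golden path length."""
    return s["coding_genes"] / (s["golden_path_length"] / 1_000_000)


if __name__ == "__main__":
    print("Non coding breakdown consistent:",
          non_coding_breakdown_is_consistent(stats))
    print("Total genes (coding + non coding + pseudogenes):",
          total_genes(stats))
    print("Gene transcripts:", stats["gene_transcripts"])
    print("Coding genes per Mb: %.1f" % coding_genes_per_megabase(stats))
```

With these values, the three gene categories sum to the same number as the transcript count, which is consistent with one annotated transcript per gene, although the tables themselves do not state how transcripts are distributed.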