Gene families

Gene families are sets of proteins that have been clustered based on sequence similarity. They can be viewed in the web interface or accessed programmatically using the Ensembl Compara Perl API.

In Ensembl Bacteria, Fungi and Protists, gene families provide a way of exploring similar proteins across a wide range of bacterial genomes for which the standard peptide comparative pipeline cannot be run. They are populated with proteins from all genomes using the HAMAP and PANTHER classifications provided by InterPro. This InterPro gene families pipeline uses the same database schema and API as the Ensembl Vertebrates gene family pipeline, but the two pipelines build their families using different methods, so the resulting families should not be interpreted as equivalent.
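The idea of populating families from a shared classification can be illustrated with a small sketch. This is not the actual pipeline code: it only assumes that each protein carries one or more signature accessions (for example from HAMAP or PANTHER) and that proteins sharing an accession are placed in the same family. The protein IDs, genome names, accessions and the `build_families` helper below are all hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical annotation rows: (protein_id, genome, signature_accession).
# The accessions only mimic the general shape of HAMAP ("MF_...") and
# PANTHER ("PTHR...") identifiers; they are placeholders, not real data.
annotations = [
    ("protA1", "genome_a", "MF_00001"),
    ("protB1", "genome_b", "MF_00001"),
    ("protA2", "genome_a", "PTHR10000"),
    ("protC1", "genome_c", "PTHR10000"),
    ("protC2", "genome_c", "MF_00001"),
]


def build_families(rows):
    """Group proteins into families keyed by their signature accession.

    Assumption for this sketch: one family per accession, and a protein
    with several accessions would appear in several families.
    """
    families = defaultdict(set)
    for protein_id, genome, accession in rows:
        families[accession].add((genome, protein_id))
    # Sort members so the result is deterministic and easy to inspect.
    return {acc: sorted(members) for acc, members in families.items()}


families = build_families(annotations)

for accession, members in sorted(families.items()):
    genomes = sorted({genome for genome, _ in members})
    print(f"{accession}: {len(members)} proteins across {len(genomes)} genomes")
```

Because membership here depends only on a precomputed classification, no all-against-all sequence comparison is needed, which is one plausible reason this kind of approach scales to a large number of genomes.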

InterPro gene families are not available for other Ensembl Genomes divisions such as Plants or Metazoa. However, in exceptional circumstances, if a protein cluster identified by the gene-tree building process is too large to process completely within the constraints of the release cycle, it may be made available via the gene families web view.