Ensembl Tools Configuration

Introduction

Since release 75, we have been moving our online tools to a new modular system that can be installed on any external site, without any dependencies on proprietary software. These tools can be integrated into any Ensembl-powered site, and modified by project developers to use any backend architecture required.

Whilst we aim to make our documentation as accurate as we can, it is not possible to cover every contingency or variation in server setup. If you are still having problems after following these instructions carefully, please contact us and we will do our best to help.

System architecture

The basic components of the system are:

  • a ticket database, allowing retrieval and storage of jobs
  • a dispatcher and, for BLAST only (at present), an optional queueing system to run the jobs. The dispatcher and queueing system use Ensembl eHive by default, but could be replaced by other custom modules which use different technologies, such as the Ensembl REST API.
  • two plugins to the ensembl webcode:
    • public-plugins/tools, which provides the user interface and manages the ticketing system
    • public-plugins/tools_hive, which manages job submission to eHive

In addition, you will need the ensembl-orm git repository, which is a database access layer used by several web plugins, including user accounts. If you already have user accounts set up on your Ensembl mirror, they will automatically be available to the tools system, but it is not necessary to have accounts enabled in order to run Ensembl Tools.

Software installation

1. Download code

First, set up your Ensembl mirror site as per the main installation instructions.

Then, in the same directory as the rest of your Ensembl code, clone the ensembl-orm repo from GitHub, if you don't already have it. You will also need to install the Rose ORM suite, available from CPAN.
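
As a minimal sketch of this step, the shell functions below clone ensembl-orm only if it is missing and then install the Rose modules. The ENSEMBL_ROOT variable, its default value and the function names are assumptions made for illustration, as is the use of cpanm as the CPAN client; substitute your own paths and preferred installer.

```shell
# Assumption: ENSEMBL_ROOT is the directory that holds your Ensembl git checkouts.
ENSEMBL_ROOT="${ENSEMBL_ROOT:-/usr/local/ensembl}"

# Clone ensembl-orm alongside the other repositories unless it is already there.
fetch_ensembl_orm() {
  if [ -d "$ENSEMBL_ROOT/ensembl-orm" ]; then
    echo "ensembl-orm already present in $ENSEMBL_ROOT"
  else
    git clone https://github.com/Ensembl/ensembl-orm.git "$ENSEMBL_ROOT/ensembl-orm"
  fi
}

# Install the Rose ORM modules from CPAN (assumes cpanm is on your PATH).
install_rose() {
  cpanm Rose::DB Rose::DB::Object
}
```

Calling `fetch_ensembl_orm && install_rose` runs both steps in order and stops if the clone fails.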

To use the eHive dispatcher you will also need to follow the eHive installation instructions.

2. Create data files directory

All of the Ensembl tools use some kind of index/cache files for fast data processing, and it makes sense to keep all of them in the same place for easy maintenance. Create this directory somewhere that the webserver can access, e.g. in the same directory as your ensembl git repositories. For example, if you've used the default location given in the main installation instructions, you might put your data files in /usr/local/ensembl/tools_data/.
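
The sketch below creates the directory and checks that it can be read and traversed. The TOOLS_DATA_DIR variable and the function name are hypothetical, and the check only reflects the permissions of whoever runs it, so run it as the account your web server uses if you want a meaningful answer.

```shell
# Assumption: the default below is the example location mentioned in the text.
TOOLS_DATA_DIR="${TOOLS_DATA_DIR:-/usr/local/ensembl/tools_data}"

# Create the directory (and any missing parents), then confirm that the
# current user can read and enter it.
make_tools_data_dir() {
  mkdir -p "$TOOLS_DATA_DIR" || return 1
  if [ -r "$TOOLS_DATA_DIR" ] && [ -x "$TOOLS_DATA_DIR" ]; then
    echo "ready: $TOOLS_DATA_DIR"
  else
    echo "not accessible: $TOOLS_DATA_DIR" >&2
    return 1
  fi
}
```

Set TOOLS_DATA_DIR beforehand to use a different location, then call `make_tools_data_dir`.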

3. Configure

Configure the additional plugins in ensembl-webcode/conf/Plugins.pm as shown below:

  'EnsEMBL::Mirror'      => $SiteDefs::ENSEMBL_SERVERROOT.'/public-plugins/mirror',
  'EnsEMBL::Tools_hive'  => $SiteDefs::ENSEMBL_SERVERROOT.'/public-plugins/tools_hive',
  'EnsEMBL::Tools'       => $SiteDefs::ENSEMBL_SERVERROOT.'/public-plugins/tools',
  'EnsEMBL::Ensembl'     => $SiteDefs::ENSEMBL_SERVERROOT.'/public-plugins/ensembl',
  'EnsEMBL::Docs'        => $SiteDefs::ENSEMBL_SERVERROOT.'/public-plugins/docs'

Configure connections to the database server(s) where you wish to host your tools and eHive databases, in public-plugins/mirror/conf/ini-files/MULTI.ini:

[databases]

[DATABASE_WEB_TOOLS]
HOST = myhost
PORT = 3306
USER = myuser

[DATABASE_WEB_HIVE]
HOST = myhost
PORT = 3306
USER = myuser

Note that you don't need to configure database names, as default names are provided in the plugins.
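
Before going further, it can help to confirm that the server accepts the details you have just entered. The helper below is a hypothetical sketch that uses the standard mysql command-line client, which we assume is installed; it does not read MULTI.ini, so pass in the same HOST, PORT and USER values by hand. If the account needs a password, add the client's --password option.

```shell
# Hypothetical helper: confirm a database server accepts the configured details.
# $1 = HOST, $2 = PORT, $3 = USER, as written in MULTI.ini.
check_db_server() {
  if mysql --host="$1" --port="$2" --user="$3" --execute='SELECT 1' >/dev/null; then
    echo "connection OK: $3@$1:$2"
  else
    echo "connection failed: $3@$1:$2" >&2
    return 1
  fi
}
```

For the placeholder values shown above, the call would be `check_db_server myhost 3306 myuser`.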

4. Restart your mirror website

In order to pick up the plugins and compile any required database settings, JavaScript, etc. into your site, you will need to run ctrl-scripts/restart with the -r flag.

Note: the code will complain that it can't find your ensembl_web_tools database, but that's OK - ignore the warning and continue with installation.

5. Create your tools database

From the tools plugin, run utils/create_tools_db.pl to set up the ensembl_web_tools MySQL database.

6. Create your eHive database

From the tools_hive plugin, run utils/init_pipeline.pl to set up the ensembl_web_hive MySQL database.
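
Steps 5 and 6 can be sketched together as below. We assume here that both plugins live under public-plugins in the directory named by ENSEMBL_ROOT (a variable introduced for this example, as are the function names) and that each script runs without extra arguments; check the scripts themselves if they ask for options.

```shell
# Assumption: ENSEMBL_ROOT is your server root, containing public-plugins/.
ENSEMBL_ROOT="${ENSEMBL_ROOT:-/usr/local/ensembl}"

# Run a Perl utility from inside a plugin directory.
# $1 = plugin directory under public-plugins, $2 = script path relative to it.
run_plugin_script() {
  ( cd "$ENSEMBL_ROOT/public-plugins/$1" && perl "$2" )
}

# Create the tools database first, then the eHive database.
create_tools_databases() {
  run_plugin_script tools utils/create_tools_db.pl &&
  run_plugin_script tools_hive utils/init_pipeline.pl
}
```

Calling `create_tools_databases` stops after the first script if it fails, so the eHive database is only attempted once the tools database has been created.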

7. Set up individual tools

Each tool can be added to the interface separately, so for example you don't need to set up BLAST if you only need the VEP. However, all tools are enabled by default, so if you don't need a tool, you should disable it in public-plugins/mirror/SiteDefs.pm. For example:

  $SiteDefs::ENSEMBL_BLAST_ENABLED  = 0;
  $SiteDefs::ENSEMBL_VEP_ENABLED    = 1;
  $SiteDefs::ENSEMBL_AC_ENABLED     = 0; # Assembly Converter
  $SiteDefs::ENSEMBL_IDM_ENABLED    = 0; # ID History Converter

For each enabled tool, you will need to do additional set-up - see the table below for links to the instructions.

The following tools have been ported to the new interface:

Tool                  Setup                         User guide
BLAST/BLAT            Setup instructions            User guide
VEP                   Setup instructions            User guide
Assembly Converter    Setup instructions
ID History Converter  No additional setup required

8. Start the beekeeper process

Go to public-plugins/tools_hive/utils and run the script as follows:

perl beekeeper_manager.pl --keep_alive --sleep=0.5
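
If you start the beekeeper often, the command can be wrapped in a small function. This is only a sketch: ENSEMBL_ROOT and the function name are assumptions made for the example, and the options are exactly those shown above.

```shell
# Assumption: ENSEMBL_ROOT is your server root, containing public-plugins/.
ENSEMBL_ROOT="${ENSEMBL_ROOT:-/usr/local/ensembl}"

# Change into the tools_hive utils directory and run the manager script.
# The subshell keeps the caller's working directory unchanged.
start_beekeeper() {
  ( cd "$ENSEMBL_ROOT/public-plugins/tools_hive/utils" &&
    perl beekeeper_manager.pl --keep_alive --sleep=0.5 )
}
```

To leave it running and keep its output, you could call `start_beekeeper >>beekeeper.log 2>&1 &`, where the log file name is again just an example.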