Perl API Installation
Introduction
All data sets in the Ensembl system are stored in relational databases (MySQL). For each of the Ensembl databases, the project provides a specific Perl API. Because Ensembl also takes advantage of code provided by the BioPerl project, installation of the BioPerl package is included in these instructions. The Ensembl API is compatible with Perl versions 5.14 through 5.26.
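Before installing, it can be worth confirming that the local Perl falls inside the version range stated above. The following is a minimal sketch, not part of the Ensembl tooling: the function name `perl_version_supported` is made up for this example, and it compares only the major and minor version numbers.

```shell
#!/bin/sh
# perl_version_supported VERSION
# Succeeds if VERSION (e.g. "5.22.1") lies in the documented 5.14 to 5.26 range.
# Hypothetical helper for illustration; only major.minor is inspected.
perl_version_supported() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # second component
    [ "$major" -eq 5 ] && [ "$minor" -ge 14 ] && [ "$minor" -le 26 ]
}

if command -v perl >/dev/null 2>&1; then
    # $^V is Perl's own version; the %vd format prints it as dotted numbers.
    version=$(perl -e 'printf "%vd", $^V')
    if perl_version_supported "$version"; then
        echo "Perl $version is within the documented range"
    else
        echo "Perl $version is outside the documented 5.14-5.26 range"
    fi
else
    echo "perl not found on PATH"
fi
```

A Perl outside this range may still work, but it is not covered by the compatibility statement above.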
Video Tutorial
Ensembl has produced a video tutorial about how to install the API. Its content is based on this document, so you can follow both resources when performing an installation. All commands used in the video can be found in the following document on our FTP site.
Installation Procedure
There are two ways of installing the Perl API. You can clone it from GitHub using Git if you have that available, or you can download the files in gzipped TAR format from our FTP site. You will also need BioPerl 1.6.924 core modules (bioperl-live).
N.B. We recommend waiting until a few days after a release before downloading the new API (or re-downloading after a few days), as there may be post-release bug fixes added to the code.
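For the Git route mentioned above, the clone commands might look like the sketch below. It assumes the four API repositories used later in this document (ensembl, ensembl-compara, ensembl-variation, ensembl-funcgen) are hosted under https://github.com/Ensembl/ and carry a `release/<number>` branch; the release number 113 is taken from the ensembl-io download further down and may not be the one you need. `DRY_RUN`, `ENSEMBL_RELEASE` and `clone_ensembl_api` are names invented for this example. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch: print (or run) git clone commands for the core Ensembl API repositories.
# Assumptions: repository URLs and the release/<N> branch naming; adjust as needed.
ENSEMBL_RELEASE="${ENSEMBL_RELEASE:-113}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = actually clone

clone_ensembl_api() {
    for repo in ensembl ensembl-compara ensembl-variation ensembl-funcgen; do
        url="https://github.com/Ensembl/${repo}.git"
        if [ "$DRY_RUN" = "1" ]; then
            echo "git clone --branch release/${ENSEMBL_RELEASE} --depth 1 ${url}"
        else
            # --depth 1 fetches only the latest commit of the branch
            git clone --branch "release/${ENSEMBL_RELEASE}" --depth 1 "$url" || return 1
        fi
    done
}

clone_ensembl_api
```

To perform the clones, run the script from inside your installation directory with `DRY_RUN=0`. BioPerl still needs to be obtained separately, as described in the steps that follow.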
- Create an installation directory and download the distributions:
$ cd
$ mkdir src
$ cd src
$ wget https://ftp.ensembl.org/pub/grch37/ensembl-api.tar.gz
$ wget https://github.com/bioperl/bioperl-live/archive/release-1-6-924.zip
- Unpack the downloaded files. In the Unix command line, type:
$ tar zxvf ensembl-api.tar.gz
$ unzip release-1-6-924.zip
In Windows, you will need an unzipping utility such as 7-Zip.
- Rename the bioperl-live directory. In the Unix command line, type:
$ mv bioperl-live-release-1-6-924 bioperl-1.6.924
In the classic Windows command line, use ren instead of mv.
- Set up your environment
You have to tell Perl where to find the modules you just installed. You can do this with use lib statements in your script, but if you want to make these modules available to all your scripts, the best way is to add them to the PERL5LIB environment variable.
- Under bash, ksh, or any sh-derived shell:
PERL5LIB=${PERL5LIB}:${HOME}/src/bioperl-1.6.924
PERL5LIB=${PERL5LIB}:${HOME}/src/ensembl/modules
PERL5LIB=${PERL5LIB}:${HOME}/src/ensembl-compara/modules
PERL5LIB=${PERL5LIB}:${HOME}/src/ensembl-variation/modules
PERL5LIB=${PERL5LIB}:${HOME}/src/ensembl-funcgen/modules
export PERL5LIB
- Under csh or tcsh:
setenv PERL5LIB ${PERL5LIB}:${HOME}/src/bioperl-1.6.924
setenv PERL5LIB ${PERL5LIB}:${HOME}/src/ensembl/modules
setenv PERL5LIB ${PERL5LIB}:${HOME}/src/ensembl-compara/modules
setenv PERL5LIB ${PERL5LIB}:${HOME}/src/ensembl-variation/modules
setenv PERL5LIB ${PERL5LIB}:${HOME}/src/ensembl-funcgen/modules
- Under Windows (assuming you installed the APIs in C:\src\):
set PERL5LIB=C:\src\bioperl-1.6.924;C:\src\ensembl\modules;C:\src\ensembl-compara\modules;C:\src\ensembl-variation\modules;C:\src\ensembl-funcgen\modules
- In Perl (we do not recommend creating hard-coded dependencies in Perl scripts):
use lib "$ENV{HOME}/src/bioperl-1.6.924";
use lib "$ENV{HOME}/src/ensembl/modules";
use lib "$ENV{HOME}/src/ensembl-compara/modules";
use lib "$ENV{HOME}/src/ensembl-variation/modules";
use lib "$ENV{HOME}/src/ensembl-funcgen/modules";
- Variation genotype and frequency data
To retrieve genotype, frequency and linkage disequilibrium (LD) data for 1000 Genomes phase 3 variants, it is necessary to install a couple of extra dependencies:
- Bio-DB-HTS and perl module:
cd ~/src
git clone --branch master --depth 1 https://github.com/samtools/htslib.git
cd htslib
make
export HTSLIB_DIR=${HOME}/src/htslib/
cd ..
git clone https://github.com/Ensembl/Bio-DB-HTS.git
cd Bio-DB-HTS
perl Build.PL
./Build
export PERL5LIB=$PERL5LIB:${HOME}/src/Bio-DB-HTS/lib:${HOME}/src/Bio-DB-HTS/blib/arch/auto/Bio/DB/HTS/:${HOME}/src/Bio-DB-HTS/blib/arch/auto/Bio/DB/HTS/Faidx
cd ..
cd ensembl-variation/C_code/
make
cd ../../
Set up environment; use the path output from the "make && make install" command for the PERL5LIB variable, e.g.
PERL5LIB=${PERL5LIB}:${HOME}/src/lib/perl/5.14.4/
export PERL5LIB
- ensembl-io perl modules (only if you didn't use Git Ensembl tools to install the API):
cd ~/src
wget https://github.com/Ensembl/ensembl-io/archive/release/113.zip
unzip 113.zip
mv ensembl-io-release-113 ensembl-io
Add this to PERL5LIB.
- Under bash, ksh, or any sh-derived shell:
PERL5LIB=${PERL5LIB}:${HOME}/src/ensembl-io/modules
export PERL5LIB
- Under csh or tcsh:
setenv PERL5LIB ${PERL5LIB}:${HOME}/src/ensembl-io/modules
- Non-vertebrates
If you are working with non-vertebrate genomes, you will also need the ensembl-metadata modules (only if you didn't use Git Ensembl tools to install the API):
cd ~/src
wget https://github.com/Ensembl/ensembl-metadata/archive/release/113.zip
unzip 113.zip
mv ensembl-metadata-release-113 ensembl-metadata
Add this to PERL5LIB.
- Under bash, ksh, or any sh-derived shell:
PERL5LIB=${PERL5LIB}:${HOME}/src/ensembl-metadata/modules
export PERL5LIB
- Under csh or tcsh:
setenv PERL5LIB ${PERL5LIB}:${HOME}/src/ensembl-metadata/modules
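The PERL5LIB additions from the steps above can also be collected in one place, for example in a file sourced from your shell start-up script. The sketch below is one possible way to do this for sh-derived shells; `append_perl5lib` and `SRC_DIR` are names invented for this example, and the directory layout is assumed to be the `~/src` layout used throughout this document. Unlike the plain assignments above, the helper skips entries that are already present and avoids a leading colon when PERL5LIB starts out empty.

```shell
#!/bin/sh
# append_perl5lib DIR
# Adds DIR to PERL5LIB unless it is already listed (hypothetical helper).
append_perl5lib() {
    case ":${PERL5LIB:-}:" in
        *":$1:"*) ;;                                    # already present
        *) PERL5LIB="${PERL5LIB:+${PERL5LIB}:}$1" ;;    # append, no leading ':'
    esac
}

# Assumed installation directory, matching the steps above.
SRC_DIR="${SRC_DIR:-${HOME}/src}"

append_perl5lib "${SRC_DIR}/bioperl-1.6.924"
for repo in ensembl ensembl-compara ensembl-variation ensembl-funcgen ensembl-io; do
    append_perl5lib "${SRC_DIR}/${repo}/modules"
done
# Only needed for non-vertebrate genomes:
# append_perl5lib "${SRC_DIR}/ensembl-metadata/modules"
export PERL5LIB
```

Because the helper is idempotent, sourcing the file more than once in the same session does not lengthen PERL5LIB with repeated entries.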
Debugging an Installation
Sometimes installations can go wrong. You should follow our debugging installation guide to help diagnose and resolve installation issues.
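A common cause of problems is a PERL5LIB entry that points at a directory that does not exist, for instance after a typo or a skipped rename step. As a first check before turning to the guide, a small script along the following lines can list each entry and flag the missing ones. This is an illustrative sketch, not the official debugging procedure, and `check_perl5lib` is a name invented here.

```shell
#!/bin/sh
# check_perl5lib
# Prints every PERL5LIB entry with a status and fails if any entry is not a
# directory. The body runs in a subshell so the IFS change stays local.
check_perl5lib() (
    missing=0
    IFS=:
    for dir in ${PERL5LIB:-}; do
        [ -n "$dir" ] || continue          # skip empty entries such as a leading ':'
        if [ -d "$dir" ]; then
            echo "ok       $dir"
        else
            echo "MISSING  $dir"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
)

check_perl5lib || echo "Some PERL5LIB entries do not exist; re-check the paths set earlier."
```

If every entry is reported as ok, a further quick test is to ask Perl to load one of the API modules, for example `perl -MBio::EnsEMBL::Registry -e 1`, which should exit silently when the modules can be found.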
Tips for Windows and Mac Users
Ensembl can be installed on both Windows and Mac machines; however, installation is not as straightforward as on Linux. We recommend that you consult our two blog posts detailing how to install Ensembl on Windows and on OS X. The fastest way to get up and running with Ensembl on these operating systems is to use our virtual machine.