Variant Effect Predictor Plugins


VEP can use plugin modules written in Perl to add functionality to the software.

Plugins are a powerful way to extend, filter and manipulate the VEP output.
They can be installed using VEP's installer script. Run the following command to get a list of available plugins:

perl INSTALL.pl -a p -g list

Alternatively, VEP plugins and their dependencies are available in the Docker image. Read how to use Ensembl VEP in Docker and Singularity.

Some plugins are also available to use via the VEP web and REST interfaces.


Existing plugins

We have written several plugins that implement experimental functionality that we do not (yet) include in the variation API. These are stored in a public GitHub repository:

https://github.com/Ensembl/VEP_plugins

Here is the list of the VEP plugins available:

Plugin Description Category External libraries Developer

This plugin for the Ensembl Variant Effect Predictor (VEP) annotates missense variants with the
pre-computed AlphaMissense pathogenicity scores. AlphaMissense is a deep learning model developed
by Google DeepMind that predicts the pathogenicity of single nucleotide missense variants. more

Pathogenicity predictions
-Ensembl

A VEP plugin that retrieves ancestral allele sequences from a FASTA file. more

Conservation
-Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that
adds the BayesDel scores to VEP output. more

Pathogenicity predictions
-Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that
looks up the BLOSUM 62 substitution matrix score for the reference
and alternative amino acids predicted for a missense mutation. It adds
one new entry to the VEP's Extra column, BLOSUM62 which is the
associated score.

Conservation
-Ensembl
Combined Annotation Dependent Depletion

A VEP plugin that retrieves CADD scores for variants from one or more
tabix-indexed CADD data files. more

Pathogenicity predictions
-Ensembl

A VEP plugin that retrieves CAPICE scores for variants from one or more
tabix-indexed CAPICE data files, in order to predict their pathogenicity. more

Pathogenicity predictions
-Ensembl

A VEP plugin that calculates the Combined Annotation scoRing toOL (CAROL)
score (1) for a missense mutation based on the pre-calculated SIFT (2) and
PolyPhen-2 (3) scores from the Ensembl API (4). more

Pathogenicity predictions
Math::CDF qw(pnorm qnorm)
Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that adds pre-calculated scores from ClinPred.
ClinPred is a prediction tool to identify disease-relevant nonsynonymous variants. more

Pathogenicity predictions
-Ensembl

A VEP plugin that calculates the Consensus Deleteriousness (Condel) score (1)
for a missense mutation based on the pre-calculated SIFT (2) and PolyPhen-2 (3)
scores from the Ensembl API (4). more

Pathogenicity predictions
-Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that
retrieves a conservation score from the Ensembl Compara databases
for variant positions. You can specify the method link type and
species sets as command line options, the default is to fetch GERP
scores from the EPO 35 way mammalian alignment (please refer to the
Compara documentation for more details of available analyses). more

Conservation
Net::FTP
Ensembl

A VEP plugin that retrieves data for missense variants from a tabix-indexed
dbNSFP file. more

Pathogenicity predictions
File::Basename qw(basename)
Ensembl

A VEP plugin that retrieves data for splicing variants from a tabix-indexed
dbscSNV file. more

Splicing predictions
-Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that
adds Variant-Disease-PMID associations from the DisGeNET database.
It is available for GRCh38. more

Phenotype data and citations
List::MoreUtils qw(uniq)
Ensembl

A VEP plugin that retrieves haploinsufficiency and triplosensitivity probability scores
for affected genes from a dosage sensitivity catalogue published in this paper:
https://www.sciencedirect.com/science/article/pii/S0092867422007887

Functional effect
-Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that
predicts the downstream effects of a frameshift variant on the protein
sequence of a transcript. It provides the predicted downstream protein
sequence (including any amino acids overlapped by the variant itself),
and the change in length relative to the reference protein. more

Nearby features
-Ensembl

A VEP plugin that draws pictures of the transcript model showing the
variant location. Can take five optional parameters.

Visualisation
Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that adds pre-calculated Enformer predictions of variant impact on chromatin and gene expression. more

Regulatory impact
-Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that
adds information from EVE (evolutionary model of variant effect).
This plugin only reports EVE scores for input variants
and does not merge input lines to report on adjacent variants.
It is only available for GRCh38. more

Pathogenicity predictions
-Ensembl

A VEP plugin that gets FATHMM scores and predictions for missense variants. more

Pathogenicity predictions
-Ensembl

A VEP plugin that retrieves FATHMM-MKL scores for variants from a tabix-indexed
FATHMM-MKL data file. more

Pathogenicity predictions
-Ensembl

A VEP plugin that retrieves the LRG ID matching either the RefSeq or Ensembl
transcript IDs. You can obtain the 'list_LRGs_transcripts_xrefs.txt' using: more

External ID
Text::CSV
Stephen Kazakoff

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that
adds tissue-specific transcription factor motifs from FunMotifs to VEP output. more

Motif
-Ensembl
gene2phenotype

A VEP plugin that uses G2P allelic requirements to assess variants in genes
for potential phenotype involvement. more

Phenotype data and citations
Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that
runs GeneSplicer (https://ccb.jhu.edu/software/genesplicer/) to get
splice site predictions. more

Splicing predictions
Digest::MD5 qw(md5_hex)
Ensembl

A VEP plugin that adds information from Geno2MP, a web-accessible database of
rare variant genotypes linked to phenotypic information. more

Phenotype data and citations
-Ensembl

A VEP plugin that retrieves gnomAD annotation from either the genome
or exome coverage files, available here:
https://gnomad.broadinstitute.org/downloads more

Frequency data
Stephen Kazakoff
Gene Ontology

A VEP plugin that retrieves Gene Ontology (GO) terms associated with
transcripts (e.g. GRCh38) or their translations (e.g. GRCh37) using custom
GFF annotation containing GO terms. more

Phenotype data and citations
-Ensembl

A VEP plugin that retrieves relevant NHGRI-EBI GWAS Catalog data given the file. more

Phenotype data and citations
Ensembl

A plugin for the Ensembl Variant Effect Predictor (VEP) that returns
HGVS intron start and end offsets. To be used with the --hgvs option.

HGVS
-Stephen Kazakoff

A VEP plugin that retrieves molecular interaction data for variants as reported by the IntAct database.

Functional effect
-Ensembl
Linkage Disequilibrium

A VEP plugin that finds variants in linkage disequilibrium with any overlapping
existing variants from the Ensembl variation databases. more

Variant data
-Ensembl

The LocalID plugin allows you to use variant IDs as input without making a database connection. more

Look up
-Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that
adds the LOEUF scores to VEP output. LOEUF stands for the "loss-of-function
observed/expected upper bound fraction." more

Pathogenicity predictions
Scalar::Util qw(looks_like_number)
Ensembl
Loss-of-function

Add LoFtool scores to the VEP output. more

Pathogenicity predictions
DBI
Ensembl
Leiden Open Variation Database

A VEP plugin that retrieves LOVD variation data from http://www.lovd.nl/. more

Variant data
LWP::UserAgent
Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that
uses the Mastermind Genomic Search Engine (https://www.genomenon.com/mastermind)
to report variants that have clinical evidence cited in the medical literature.
It is available for both GRCh37 and GRCh38. more

Phenotype data and citations
-Ensembl

A VEP plugin that retrieves data from MaveDB (https://www.mavedb.org), a
database that contains multiplex assays of variant effect, including deep
mutational scans and massively parallel reporter assays.

Functional effect
Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that
runs MaxEntScan (http://hollywood.mit.edu/burgelab/maxent/Xmaxentscan_scoreseq.html)
to get splice site predictions. more

Splicing predictions
Digest::MD5 qw(md5_hex)
Ensembl
missense deleteriousness metric

A VEP plugin that retrieves MPC scores for variants from a tabix-indexed MPC data file. more

Pathogenicity predictions
-Ensembl
Missense Tolerance Ratio

A VEP plugin that retrieves Missense Tolerance Ratio (MTR) scores for
variants from a tabix-indexed flat file. more

Pathogenicity predictions
-
  • Slave Petrovski
  • Michael Silk

A VEP plugin that retrieves data from mutfunc db predicting destabilization of protein structure, interaction interface, and motif. more

Protein annotation
Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that
finds the nearest exon junction boundary to a coding sequence variant. More than
one boundary may be reported if the boundaries are equidistant. more

Nearby features
-Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that
finds the nearest gene(s) to a non-genic variant. More than one gene
may be reported if the genes overlap the variant or if genes are
equidistant. more

Nearby features
-Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that
retrieves data for missense and stop gain variants from neXtProt, which is a comprehensive
human-centric discovery platform that offers integration of and navigation
through protein-related data, for example variant information, localization
and interactions (https://www.nextprot.org/).

Protein data
JSON::XS
Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP)
that predicts whether a variant allows the
transcript to escape nonsense-mediated mRNA decay based on certain rules.

Transcript annotation
-Ensembl

A VEP plugin that integrates data from Open Targets Genetics
(https://genetics.opentargets.org), a tool that highlights variant-centric
statistical evidence to allow both prioritisation of candidate causal variants
at trait-associated loci and identification of potential drug targets. more

Variant data
Ensembl

A VEP plugin that retrieves overlapping phenotype information. more

Phenotype data and citations
-Ensembl

A VEP plugin that adds the probability of a gene being
loss-of-function intolerant (pLI) to the VEP output. more

Pathogenicity predictions
Ensembl

This plugin for Ensembl Variant Effect Predictor (VEP) computes the predictions of PON-P2
for amino acid substitutions in human proteins. PON-P2 is developed and maintained by
Protein Structure and Bioinformatics Group at Lund University and is available at
http://structure.bmc.lu.se/PON-P2/. more

Pathogenicity predictions
-
  • Abhishek Niroula
  • Mauno Vihinen

A VEP plugin that retrieves data for variants from a tabix-indexed PostGAP file (1-based file). more

Phenotype data and citations
-Ensembl

The PrimateAI VEP plugin is designed to retrieve clinical impact scores of
variants, as described in https://www.nature.com/articles/s41588-018-0167-z.
Please consider citing the paper if using this plugin. more

Pathogenicity predictions
-Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that
prints out the reference and mutated protein sequences of any
proteins found with non-synonymous mutations in the input file. more

Sequence
-Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that
reports on the quality of the reference genome using GRC data at the location of your variants.
More information can be found at: https://www.ncbi.nlm.nih.gov/grc/human/issues more

Sequence
-Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that
adds the REVEL score for missense variants to VEP output. more

Pathogenicity predictions
-Ensembl

A VEP plugin that reports existing variants that fall in the same codon.
This plugin requires a database connection and cannot be run in offline mode.

Variant data
-Ensembl

A VEP plugin that retrieves data for variants from a tabix-indexed satMutMPRA file (1-based file).
The saturation mutagenesis-based massively parallel reporter assays (satMutMPRA) measures variant
effects on gene RNA expression for 21 regulatory elements (11 enhancers, 10 promoters). more

Phenotype data and citations
-Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that
returns a HGVSp string with single amino acid letter codes.

HGVS
-Ensembl

A VEP plugin that retrieves pre-calculated annotations from SpliceAI.
SpliceAI is a deep neural network, developed by Illumina, Inc
that predicts splice junctions from an arbitrary pre-mRNA transcript sequence. more

Splicing predictions
List::Util qw(max)
Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that
provides more granular predictions of splicing effects. more

Splicing predictions
-Ensembl

A VEP plugin that retrieves information from overlapping structural variants. more

Structural variant data
-Ensembl

A VEP plugin to retrieve overlapping records from a given VCF file.
Values for POS, ID, and ALT, are retrieved as well as values for any requested
INFO field. Additionally, the allele number of the matching ALT is returned. more

Variant data
Joseph A. Prinz

A VEP plugin that annotates variant-transcript pairs based on a given file: more

Transcript annotation
File::Basename
Ensembl

A VEP plugin that calculates the distance from the transcription
start site for upstream variants.

Nearby features
-Ensembl

A VEP plugin that annotates the effect of 5' UTR variants, especially variants creating or disrupting upstream ORFs.
Available for both GRCh37 and GRCh38. more

Transcript annotation
Scalar::Util qw(looks_like_number)
  • Xiaolei Zhang
  • Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that
adds the pre-computed VARITY scores to predict pathogenicity of
rare missense variants to VEP output. more

Pathogenicity predictions
-Ensembl

We hope that these will serve as useful examples for users implementing new plugins. If you have any questions about the system, or suggestions for enhancements, please let us know on the ensembl-dev mailing list.
We also encourage you to share any plugins you develop: we are happy to accept pull requests on the VEP_plugins git repository.

There are further published plugins available outside the VEP repository including:

  • LOFTEE, a Loss-Of-Function Transcript Effect Estimator (Konrad Karczewski et al., 2020)

    How it works

    Plugins are run once VEP has finished its analysis for each line of the output, but before anything is printed to the output file.
    When each plugin is called (using the run method) it is passed two data structures to use in its analysis: the first is a reference to a variation API object that represents the combination of a variant allele and an overlapping or nearby genomic feature (such as a transcript or regulatory region), and the second is a data structure containing all the data for the current line.
    The API object provides access to all the relevant API objects that may be useful for further analysis by the plugin (such as the current VariationFeature and Transcript).
    Please refer to the Ensembl Variation API documentation for more details.


    Functionality

    We expect that most plugins will simply add information to the last column of the output file, the "Extra" column, and the plugin system assumes this in various places, but plugins are also free to alter the output line as desired.

    The only hard requirement for a plugin to work with VEP is that it implements a number of required methods (such as new which should create and return an instance of this plugin, get_header_info which should return descriptions of the type of data this plugin produces to be included in VEP output's header, and run which should actually perform the logic of the plugin).
    To make development of plugins easier, we suggest that users use the Bio::EnsEMBL::Variation::Utils::BaseVepPlugin module as their base class, which provides default implementations of all the necessary methods which can be overridden as required.
    Please refer to the documentation in this module for details of all required methods and for a simple example of a plugin implementation.
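
As a minimal sketch of the methods described above, the following hypothetical plugin inherits from BaseVepPlugin and adds one key to the Extra column. The module name MyPlugin and the output key MY_TRANSCRIPT_LENGTH are illustrative assumptions, not part of VEP. The sketch also assumes the argument order used by the plugins in the VEP_plugins repository, where run receives the variation API object first and the line hash second.

```perl
package MyPlugin;

use strict;
use warnings;

use base qw(Bio::EnsEMBL::Variation::Utils::BaseVepPlugin);

# Create and return an instance. The base class constructor stores the
# VEP configuration and any parameters given on the command line.
sub new {
    my $class = shift;
    my $self  = $class->SUPER::new(@_);
    return $self;
}

# Only run on variant alleles that overlap transcripts.
sub feature_types {
    return ['Transcript'];
}

# Describe each key this plugin adds, for inclusion in the output header.
sub get_header_info {
    return {
        MY_TRANSCRIPT_LENGTH => 'Length of the overlapped transcript (illustrative field)',
    };
}

# Called once per output line; returns a hashref of key/value pairs to add.
sub run {
    my ($self, $tva, $line_hash) = @_;

    # $tva is expected to be a TranscriptVariationAllele here, because
    # feature_types restricts this plugin to transcripts.
    my $transcript = $tva->transcript;

    return { MY_TRANSCRIPT_LENGTH => $transcript->length };
}

1;
```

Saved as MyPlugin.pm in a directory on Perl's library path, a module of this shape should add MY_TRANSCRIPT_LENGTH=<value> to the Extra column of each transcript line.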


    Filtering using plugins

    A common use for plugins will be to filter the output in some way (for example to limit output lines to missense variants) and so we provide a simple mechanism to support this.
    The run method of a plugin is assumed to return a reference to a hash containing information to be included in the output, and if a plugin should not add any data to a particular line it should return an empty hashref. If a plugin should instead filter a line and exclude it from the output, it should return undef from its run method; this also means that no further plugins will be run on the line.
    If you are developing a filter plugin, we suggest that you use the Bio::EnsEMBL::Variation::Utils::BaseVepFilterPlugin as your base class and then you need only override the include_line method to return true if you want to include this line, and false otherwise.
    Again, please refer to the documentation in this module for more details and an example implementation of a missense filter.
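
A filter of this kind might look like the sketch below. It is not the example shipped with BaseVepFilterPlugin: the module name MyMissenseFilter is hypothetical, and the sketch assumes that include_line receives the same two arguments as run (the variation API object, then the line hash). It keeps a line only if one of the allele's consequences has the Sequence Ontology term missense_variant.

```perl
package MyMissenseFilter;

use strict;
use warnings;

use base qw(Bio::EnsEMBL::Variation::Utils::BaseVepFilterPlugin);

# Missense consequences only apply to transcripts.
sub feature_types {
    return ['Transcript'];
}

# Return true to keep the line, false to drop it from the output.
sub include_line {
    my ($self, $tva, $line_hash) = @_;

    # Collect the SO terms of every consequence called for this allele.
    my @terms = map { $_->SO_term } @{ $tva->get_all_OverlapConsequences };

    return (grep { $_ eq 'missense_variant' } @terms) ? 1 : 0;
}

1;
```

Because only include_line is overridden, the base class remains responsible for translating the true or false result into the empty hashref or undef that VEP expects from run.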


    Using plugins

    In order to run a plugin you need to include the plugin module in Perl's library path. By default VEP includes the ~/.vep/Plugins directory in the path, so this is a convenient place to store plugins, but you can also include modules by any other means (e.g. using the $PERL5LIB environment variable in Unix-like systems).
    You can then run a plugin using the --plugin command line option, passing the name of the plugin module as the argument.

    For example, if your plugin is in a module called MyPlugin.pm, stored in ~/.vep/Plugins, you can run it with a command line like:

    ./vep -i input.vcf --plugin MyPlugin

    You can pass arguments to the plugin's 'new' method by including them after the plugin name on the command line, separated by commas, e.g.:

    ./vep -i input.vcf --plugin MyPlugin,1,FOO

    If your plugin inherits from BaseVepPlugin, you can then retrieve these parameters as a list from the params method.
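
For the command line above, reading the two parameters could be sketched as follows. This assumes that params returns an array reference holding the comma-separated values in order; the attribute names flag and label and the output key MY_LABEL are hypothetical and chosen only for illustration.

```perl
package MyPlugin;

use strict;
use warnings;

use base qw(Bio::EnsEMBL::Variation::Utils::BaseVepPlugin);

sub new {
    my $class = shift;
    my $self  = $class->SUPER::new(@_);

    # For "--plugin MyPlugin,1,FOO" the parameters are 1 and 'FOO'.
    # Assumption: params returns an array reference.
    my ($flag, $label) = @{ $self->params };

    # Fall back to defaults when a parameter is omitted.
    $self->{flag}  = defined $flag  ? $flag  : 0;
    $self->{label} = defined $label ? $label : 'NONE';

    return $self;
}

sub get_header_info {
    return { MY_LABEL => 'Label passed on the command line (illustrative field)' };
}

sub run {
    my ($self, $vfoa, $line_hash) = @_;

    # Add the label only when the first parameter is true.
    return $self->{flag} ? { MY_LABEL => $self->{label} } : {};
}

1;
```

Parsing the parameters once in new, rather than on every call to run, keeps the per-line work small.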

    You can run multiple plugins by supplying multiple --plugin arguments. Plugins are run serially in the order in which they are specified on the command line, so they can be run as a pipeline, with, for example, a later plugin filtering output based on the results from an earlier plugin. Note though that the first plugin to filter a line 'wins', and any later plugins won't get run on a filtered line.
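
The ordering and "first filter wins" behaviour can be illustrated with a small stand-alone simulation. This is not VEP's actual dispatch code: run_pipeline and the two code references below are hypothetical stand-ins for the plugin loop and for plugins' run methods, and the layout of the line hash is simplified.

```perl
use strict;
use warnings;

# Hypothetical stand-in for VEP's plugin loop. Each "plugin" is a code
# reference that returns a hashref (data to add), an empty hashref
# (nothing to add) or undef (filter the line out).
sub run_pipeline {
    my ($plugins, $line) = @_;

    for my $plugin (@$plugins) {
        my $result = $plugin->($line);

        # The first plugin to return undef filters the line;
        # later plugins are never called.
        return undef unless defined $result;

        # Merge the new keys so that later plugins can see them.
        $line->{Extra}{$_} = $result->{$_} for keys %$result;
    }

    return $line;
}

# An annotating plugin followed by a filter that relies on its output.
my $add_score = sub { return { SCORE => $_[0]{input_score} } };
my $keep_high = sub { return $_[0]{Extra}{SCORE} >= 0.5 ? {} : undef };

for my $score (0.9, 0.1) {
    my $line = run_pipeline(
        [$add_score, $keep_high],
        { input_score => $score, Extra => {} },
    );
    print defined $line ? "kept SCORE=$line->{Extra}{SCORE}\n" : "filtered\n";
}
```

In this sketch the first line is kept and the second is filtered. Swapping the order of the two code references would make the filter run before any SCORE exists, which is why the order of the --plugin arguments matters.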


    Intergenic variants

    When a variant falls in an intergenic region, it will usually not have any consequence types called, and hence will not have any associated VariationFeatureOverlap objects. In this special case, VEP creates a new VariationFeatureOverlap that overlaps a feature of type "Intergenic".
    To force your plugin to handle these, you must add "Intergenic" to the feature types that it will recognize; you do this by writing your own feature_types subroutine:

    sub feature_types {
        return ['Transcript', 'Intergenic'];
    }

    This will cause your plugin to handle any variation features that overlap transcripts or intergenic regions. To also include any regulatory features, you should use the generic type "Feature":

    sub feature_types {
        return ['Feature', 'Intergenic'];
    }
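
A plugin that accepts both feature types usually needs to tell them apart inside run, because an intergenic allele has no transcript to query. The sketch below shows one way to do this, assuming that the object passed for an intergenic variant is a Bio::EnsEMBL::Variation::IntergenicVariationAllele and the object passed for a transcript is a Bio::EnsEMBL::Variation::TranscriptVariationAllele. The module name MyRegionPlugin and the key MY_REGION are hypothetical.

```perl
package MyRegionPlugin;

use strict;
use warnings;

use base qw(Bio::EnsEMBL::Variation::Utils::BaseVepPlugin);

sub feature_types {
    return ['Transcript', 'Intergenic'];
}

sub get_header_info {
    return { MY_REGION => 'Transcript stable ID, or "intergenic" (illustrative field)' };
}

sub run {
    my ($self, $vfoa, $line_hash) = @_;

    # Intergenic alleles have no transcript, so handle them first.
    if ($vfoa->isa('Bio::EnsEMBL::Variation::IntergenicVariationAllele')) {
        return { MY_REGION => 'intergenic' };
    }

    if ($vfoa->isa('Bio::EnsEMBL::Variation::TranscriptVariationAllele')) {
        return { MY_REGION => $vfoa->transcript->stable_id };
    }

    # Any other feature type: add nothing, but keep the line.
    return {};
}

1;
```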