Variant Effect Predictor Plugins


VEP can use plugin modules written in Perl to extend, filter and manipulate the VEP output.

To use plugins with VEP, you can:

  • Install them using VEP's installer script. You can quickly check installed plugins by running:
    • perl INSTALL.pl -a p -g list
  • Use Ensembl VEP in Docker or Singularity. VEP plugins and their dependencies are available in the Docker image.
  • Use the VEP web and REST interfaces. Not all plugins are available there, and those that are may have limited options.

Existing plugins

We have written several plugins that implement experimental functionality that we do not (yet) include in the variation API. These are stored in a public GitHub repository:

https://github.com/Ensembl/VEP_plugins

Here is the list of the VEP plugins available:

Plugin Description Category External libraries Developer

This plugin for the Ensembl Variant Effect Predictor (VEP) annotates missense variants with the pre-computed AlphaMissense pathogenicity scores. AlphaMissense is a deep learning model developed by Google DeepMind that predicts the pathogenicity of single nucleotide missense variants.

Pathogenicity predictions
-Ensembl

A VEP plugin that retrieves ancestral allele sequences from a FASTA file.

Conservation
-Ensembl

Automatic VAriant evidence DAtabase is a novel machine learning tool that uses natural language processing to automatically identify pathogenic genetic variant evidence in full-text primary literature about monogenic disease and convert it to genomic coordinates.

Phenotype data and citations
List::MoreUtils qw(uniq)Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that adds the BayesDel scores to VEP output.

Pathogenicity predictions
-Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that looks up the BLOSUM 62 substitution matrix score for the reference and alternative amino acids predicted for a missense mutation. It adds one new entry to VEP's Extra column, BLOSUM62, which holds the associated score.

Conservation
-Ensembl
Combined Annotation Dependent Depletion

A VEP plugin that retrieves CADD scores for variants from one or more tabix-indexed CADD data files.

Pathogenicity predictions
-Ensembl

A VEP plugin that retrieves CAPICE scores for variants from one or more tabix-indexed CAPICE data files, in order to predict their pathogenicity.

Pathogenicity predictions
-Ensembl

A VEP plugin that calculates the Combined Annotation scoRing toOL (CAROL) score (1) for a missense mutation based on the pre-calculated SIFT (2) and PolyPhen-2 (3) scores from the Ensembl API (4).

Pathogenicity predictions
Math::CDF qw(pnorm qnorm)Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that adds pre-calculated scores from ClinPred. ClinPred is a prediction tool to identify disease-relevant nonsynonymous variants.

Pathogenicity predictions
-Ensembl

A VEP plugin that calculates the Consensus Deleteriousness (Condel) score (1) for a missense mutation based on the pre-calculated SIFT (2) and PolyPhen-2 (3) scores from the Ensembl API (4).

Pathogenicity predictions
-Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that retrieves a conservation score from the Ensembl Compara databases for variant positions. You can specify the method link type and species sets as command-line options; the default is to fetch GERP scores from the EPO 35-way mammalian alignment (please refer to the Compara documentation for more details of available analyses).

Conservation
Net::FTPEnsembl

A VEP plugin that retrieves data for missense variants from a tabix-indexed dbNSFP file.

Pathogenicity predictions
File::Basename qw(basename)Ensembl

A VEP plugin that retrieves data for splicing variants from a tabix-indexed dbscSNV file.

Splicing predictions
-Ensembl

A VEP plugin that identifies de novo variants in a VCF file. The plugin is not compatible with JSON output format.

Variant data
Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that adds Variant-Disease-PMID associations from the DisGeNET database. It is available for GRCh38.

Phenotype data and citations
List::MoreUtils qw(uniq)Ensembl

A VEP plugin that retrieves haploinsufficiency and triplosensitivity probability scores for affected genes from a dosage sensitivity catalogue published at https://www.sciencedirect.com/science/article/pii/S0092867422007887

Gene tolerance to change
-Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that predicts the downstream effects of a frameshift variant on the protein sequence of a transcript. It provides the predicted downstream protein sequence (including any amino acids overlapped by the variant itself), and the change in length relative to the reference protein.

Nearby features
-Ensembl

A VEP plugin that draws pictures of the transcript model showing the variant location.

Visualisation
Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that adds pre-calculated Enformer predictions of variant impact on chromatin and gene expression.

Regulatory impact
-Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that adds information from EVE (evolutionary model of variant effect).

Pathogenicity predictions
-Ensembl

A VEP plugin that gets FATHMM scores and predictions for missense variants.

Pathogenicity predictions
-Ensembl

A VEP plugin that retrieves FATHMM-MKL scores for variants from a tabix-indexed FATHMM-MKL data file.

Pathogenicity predictions
-Ensembl

A VEP plugin that retrieves the LRG ID matching either the RefSeq or Ensembl transcript IDs.

External ID
Text::CSVStephen Kazakoff

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that adds tissue-specific transcription factor motifs from FunMotifs to VEP output.

Motif
-Ensembl
gene2phenotype

A VEP plugin that uses G2P allelic requirements to assess variants in genes for potential phenotype involvement.

Phenotype data and citations
Ensembl

A user-contributed VEP plugin that retrieves automatic ACMG variant classification data from https://genebe.net/

Variant data
JSON
  • Ensembl
  • Piotr Stawinski

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that runs GeneSplicer (https://ccb.jhu.edu/software/genesplicer/) to get splice site predictions.

Splicing predictions
Digest::MD5 qw(md5_hex)Ensembl

A VEP plugin that adds information from Geno2MP, a web-accessible database of rare variant genotypes linked to phenotypic information.

Phenotype data and citations
-Ensembl

A VEP plugin that retrieves gnomAD annotation from either the genome or exome coverage files, available here: https://gnomad.broadinstitute.org/downloads

Frequency data
Stephen Kazakoff
Gene Ontology

A VEP plugin that retrieves Gene Ontology (GO) terms associated with transcripts (e.g. GRCh38) or their translations (e.g. GRCh37) using custom GFF annotation containing GO terms.

Phenotype data and citations
-Ensembl

A VEP plugin that retrieves relevant NHGRI-EBI GWAS Catalog data from a given file.

Phenotype data and citations
Ensembl

A plugin for the Ensembl Variant Effect Predictor (VEP) that returns HGVS intron start and end offsets. To be used with the --hgvs option.

HGVS
-Stephen Kazakoff

A VEP plugin that retrieves molecular interaction data for variants as reported by the IntAct database.

Functional effect
-Ensembl
Linkage Disequilibrium

A VEP plugin that finds variants in linkage disequilibrium with any overlapping existing variants from the Ensembl variation databases.

Variant data
-Ensembl

The LocalID plugin allows you to use variant IDs as input without making a database connection.

Look up
-Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that adds the LOEUF scores to VEP output. LOEUF stands for the "loss-of-function observed/expected upper bound fraction."

Gene tolerance to change
Scalar::Util qw(looks_like_number)Ensembl
Loss-of-function

Add LoFtool scores to the VEP output.

Pathogenicity predictions
DBIEnsembl
Leiden Open Variation Database

A VEP plugin that retrieves LOVD variation data from http://www.lovd.nl/.

Variant data
LWP::UserAgentEnsembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that uses the Mastermind Genomic Search Engine (https://www.genomenon.com/mastermind) to report variants that have clinical evidence cited in the medical literature. It is available for both GRCh37 and GRCh38.

Phenotype data and citations
-Ensembl

A VEP plugin that retrieves data from MaveDB (https://www.mavedb.org), a database that contains multiplex assays of variant effect, including deep mutational scans and massively parallel reporter assays.

Functional effect
Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that runs MaxEntScan (http://hollywood.mit.edu/burgelab/maxent/Xmaxentscan_scoreseq.html) to get splice site predictions.

Splicing predictions
Digest::MD5 qw(md5_hex)Ensembl
missense deleteriousness metric

A VEP plugin that retrieves MPC scores for variants from a tabix-indexed MPC data file.

Pathogenicity predictions
-Ensembl
Missense Tolerance Ratio

A VEP plugin that retrieves Missense Tolerance Ratio (MTR) scores for variants from a tabix-indexed flat file.

Pathogenicity predictions
-
  • Slave Petrovski
  • Michael Silk

A VEP plugin that retrieves data from mutfunc db predicting destabilization of protein structure, interaction interface, and motif.

Protein annotation
Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that finds the nearest exon junction boundary to a coding sequence variant. More than one boundary may be reported if the boundaries are equidistant.

Nearby features
-Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that finds the nearest gene(s) to a non-genic variant. More than one gene may be reported if the genes overlap the variant or if genes are equidistant.

Nearby features
-Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that retrieves data for missense and stop gain variants from neXtProt, a comprehensive human-centric discovery platform that offers integration of and navigation through protein-related data, for example variant information, localization and interactions (https://www.nextprot.org/).

Protein data
JSON::XSEnsembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that predicts, based on certain rules, whether a variant allows the transcript to escape nonsense-mediated mRNA decay.

Transcript annotation
-Ensembl

A VEP plugin that integrates data from Open Targets Genetics (https://genetics.opentargets.org), a tool that highlights variant-centric statistical evidence to allow both prioritisation of candidate causal variants at trait-associated loci and identification of potential drug targets.

Variant data
Ensembl

A VEP plugin that fetches variants overlapping the genomic coordinates of amino acids aligned between paralogue proteins. This is useful to predict the pathogenicity of variants in paralogue positions.

Pathogenicity predictions
Ensembl

A VEP plugin that retrieves phenotype information associated with orthologous genes from model organisms.

Phenotype data and citations
-Ensembl

A VEP plugin that retrieves overlapping phenotype information.

Phenotype data and citations
-Ensembl

A VEP plugin that adds the probability of a gene being loss-of-function intolerant (pLI) to the VEP output.

Gene tolerance to change
Ensembl

A VEP plugin that retrieves PolyPhen and SIFT predictions from a locally constructed SQLite database. It can be used when your main source of VEP transcript annotation (e.g. a GFF file or GFF-based cache) does not contain these predictions.

Pathogenicity predictions
Ensembl

This plugin for Ensembl Variant Effect Predictor (VEP) computes the predictions of PON-P2 for amino acid substitutions in human proteins.

Pathogenicity predictions
-
  • Abhishek Niroula
  • Mauno Vihinen

A VEP plugin that retrieves data for variants from a tabix-indexed PostGAP file (1-based file).

Phenotype data and citations
-Ensembl

The PrimateAI VEP plugin is designed to retrieve clinical impact scores of variants, as described in https://www.nature.com/articles/s41588-018-0167-z. Please consider citing the paper if using this plugin.

Pathogenicity predictions
-Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that prints out the reference and mutated protein sequences of any proteins found with non-synonymous mutations in the input file.

Sequence
-Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that reports on the quality of the reference genome using GRC data at the location of your variants. More information can be found at: https://www.ncbi.nlm.nih.gov/grc/human/issues

Sequence
-Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that adds the REVEL score for missense variants to VEP output.

Pathogenicity predictions
-Ensembl

This is a VEP plugin that uses a standardized catalog of human Ribo-seq ORFs to re-calculate consequences for variants located in these translated regions.

Transcript annotation
-Ensembl

A VEP plugin that reports existing variants that fall in the same codon. This plugin requires a database connection and cannot be run in offline mode.

Variant data
-Ensembl

A VEP plugin that retrieves data for variants from a tabix-indexed satMutMPRA file (1-based file). The saturation mutagenesis-based massively parallel reporter assays (satMutMPRA) measure variant effects on gene RNA expression for 21 regulatory elements (11 enhancers, 10 promoters).

Phenotype data and citations
-Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that returns an HGVSp string with single-letter amino acid codes.

HGVS
-Ensembl

A VEP plugin that retrieves pre-calculated annotations from SpliceAI. SpliceAI is a deep neural network, developed by Illumina, Inc., that predicts splice junctions from an arbitrary pre-mRNA transcript sequence.

Splicing predictions
List::Util qw(max)Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that provides more granular predictions of splicing effects.

Splicing predictions
-Ensembl

A VEP plugin that retrieves SpliceVault data to predict exon-skipping events and activated cryptic splice sites based on the most common mis-splicing events around a splice site.

Splicing predictions
-Ensembl

A VEP plugin that retrieves information from overlapping structural variants.

Structural variant data
-Ensembl

A VEP plugin to retrieve overlapping records from a given VCF file. Values for POS, ID, and ALT are retrieved, as well as values for any requested INFO field. Additionally, the allele number of the matching ALT is returned.

Variant data
Joseph A. Prinz

A VEP plugin that annotates variant-transcript pairs based on a given file.

Transcript annotation
File::BasenameEnsembl

A VEP plugin that calculates the distance from the transcription start site for upstream variants.

Nearby features
-Ensembl

A VEP plugin that annotates the effects of 5' UTR variants, especially variants that create or disrupt upstream ORFs. Available for both GRCh37 and GRCh38.

Transcript annotation
Ensembl

This is a plugin for the Ensembl Variant Effect Predictor (VEP) that adds pre-computed VARITY scores, which predict the pathogenicity of rare missense variants, to VEP output.

Pathogenicity predictions
-Ensembl

We hope that these will serve as useful examples for users implementing new plugins. If you have any questions about the system, or suggestions for enhancements, please let us know on the ensembl-dev mailing list.
We also encourage you to share any plugins you develop: we are happy to accept pull requests on the VEP_plugins git repository.

There are further published plugins available outside the VEP repository including:

  • LOFTEE, a Loss-Of-Function Transcript Effect Estimator (Konrad Karczewski et al., 2020)

    How it works

    Plugins are run once VEP has finished its analysis for each line of the output, but before anything is printed to the output file.

    When each plugin is called (using the run method) it is passed two data structures to use in its analysis: one is a data structure containing all the data for the current line, and the other is a reference to a variation API object that represents the combination of a variant allele and an overlapping or nearby genomic feature (such as a transcript or regulatory region).

    This object provides access to all the relevant API objects that may be useful for further analysis by the plugin (such as the current VariationFeature and Transcript). Please refer to the Ensembl Variation API documentation for more details.


    Functionality

    We expect that most plugins will simply add information to the last column of the output file, the "Extra" column, and the plugin system assumes this in various places, but plugins are also free to alter the output line as desired.

    The only hard requirement for a plugin to work with VEP is that it implements a number of required methods (such as new which should create and return an instance of this plugin, get_header_info which should return descriptions of the type of data this plugin produces to be included in VEP output's header, and run which should actually perform the logic of the plugin).

    To make development of plugins easier, we suggest that users use the Bio::EnsEMBL::Variation::Utils::BaseVepPlugin module as their base class, which provides default implementations of all the necessary methods which can be overridden as required. Please refer to the documentation in this module for details of all required methods and for a simple example of a plugin implementation.
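As a minimal sketch of the three methods described above, the following module inherits from BaseVepPlugin and adds one field to the Extra column. The package name TranscriptIdExample and the field name TRANSCRIPT_ID_EXAMPLE are hypothetical, chosen only for illustration. The sketch assumes that run receives the variation API object as its first argument and the line hash as its second, and that the object offers a transcript accessor, as in the plugins in the VEP_plugins repository; please check the module documentation before relying on either.

```perl
package TranscriptIdExample;   # hypothetical name; save as TranscriptIdExample.pm

use strict;
use warnings;

# Inherit default implementations of all required methods (including new).
use base qw(Bio::EnsEMBL::Variation::Utils::BaseVepPlugin);

# Only run on variant alleles that overlap transcripts.
sub feature_types {
    return ['Transcript'];
}

# Field name => description; the description is written to the output header.
sub get_header_info {
    return {
        TRANSCRIPT_ID_EXAMPLE => 'Stable ID of the overlapped transcript (illustrative)',
    };
}

# Assumed argument order: the variation API object, then the current line's data.
sub run {
    my ($self, $tva, $line_hash) = @_;

    my $transcript = $tva->transcript;

    # The returned hashref is added to the Extra column of this line.
    return { TRANSCRIPT_ID_EXAMPLE => $transcript->stable_id };
}

1;
```

Because new is inherited, a plugin like this only needs to override the methods whose behaviour it changes.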


    Filtering using plugins

    A common use for plugins will be to filter the output in some way (for example to limit output lines to missense variants) and so we provide a simple mechanism to support this.

    The run method of a plugin is assumed to return a reference to a hash containing information to be included in the output. If a plugin should not add any data to a particular line, it should return an empty hashref. If a plugin should instead filter a line and exclude it from the output, it should return undef from its run method; this also means that no further plugins will be run on the line.

    If you are developing a filter plugin, we suggest that you use the Bio::EnsEMBL::Variation::Utils::BaseVepFilterPlugin as your base class and then you need only override the include_line method to return true if you want to include this line, and false otherwise. Again, please refer to the documentation in this module for more details and an example implementation of a missense filter.
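A filter of this kind might look like the sketch below. It is written in the spirit of the missense filter mentioned above, not copied from the module. The package name MissenseOnlyExample is hypothetical, and the get_all_OverlapConsequences and SO_term calls are given as we understand the Ensembl Variation API, so treat them as assumptions to verify against the API documentation.

```perl
package MissenseOnlyExample;   # hypothetical name; save as MissenseOnlyExample.pm

use strict;
use warnings;

use base qw(Bio::EnsEMBL::Variation::Utils::BaseVepFilterPlugin);

# Consequences such as missense_variant are only called against transcripts.
sub feature_types {
    return ['Transcript'];
}

# Return true to keep the line, false to drop it from the output.
sub include_line {
    my ($self, $tva, $line_hash) = @_;

    # Assumed API: each consequence object exposes its Sequence Ontology term.
    for my $consequence (@{ $tva->get_all_OverlapConsequences }) {
        return 1 if $consequence->SO_term eq 'missense_variant';
    }

    return 0;
}

1;
```

Only include_line is overridden here; the base class is left to turn a false return value into a filtered line.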


    Using plugins

    In order to run a plugin you need to include the plugin module in Perl's library path. By default VEP includes the ~/.vep/Plugins directory in the path, so this is a convenient place to store plugins, but you can also include modules by any other means (e.g. using the $PERL5LIB environment variable in Unix-like systems).
    You can then run a plugin using the --plugin command line option, passing the name of the plugin module as the argument.

    For example, if your plugin is in a module called MyPlugin.pm, stored in ~/.vep/Plugins, you can run it with a command line like:

    ./vep -i input.vcf --plugin MyPlugin

    You can pass arguments to the plugin's 'new' method by including them after the plugin name on the command line, separated by commas, e.g.:

    ./vep -i input.vcf --plugin MyPlugin,1,FOO

    If your plugin inherits from BaseVepPlugin, you can then retrieve these parameters as a list from the params method.
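The following sketch shows one way a plugin could pick up the arguments from the command line above. It assumes that params returns an array reference, as it appears to in the plugins in the VEP_plugins repository; the attribute names threshold and label and the field MYPLUGIN_LABEL are hypothetical.

```perl
package MyPlugin;

use strict;
use warnings;

use base qw(Bio::EnsEMBL::Variation::Utils::BaseVepPlugin);

sub new {
    my $class = shift;

    # Let the base class store the configuration and the plugin arguments.
    my $self = $class->SUPER::new(@_);

    # For "--plugin MyPlugin,1,FOO" this is expected to yield (1, 'FOO').
    # Assumption: params returns an array reference.
    my ($threshold, $label) = @{ $self->params };

    # Fall back to defaults when an argument was not supplied.
    $self->{threshold} = defined $threshold ? $threshold : 0;
    $self->{label}     = defined $label     ? $label     : 'NONE';

    return $self;
}

sub get_header_info {
    return { MYPLUGIN_LABEL => 'Label passed on the command line (illustrative)' };
}

sub run {
    my ($self, $tva, $line_hash) = @_;
    return { MYPLUGIN_LABEL => $self->{label} };
}

1;
```

Reading the arguments once in new, rather than in run, avoids repeating the work for every output line.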

    You can run multiple plugins by supplying multiple --plugin arguments. Plugins are run serially in the order in which they are specified on the command line, so they can be run as a pipeline, with, for example, a later plugin filtering output based on the results from an earlier plugin. Note though that the first plugin to filter a line 'wins', and any later plugins won't get run on a filtered line.
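The ordering and "first filter wins" behaviour can be pictured with a toy model in plain Perl. This is not VEP's own code: run_plugin_chain and the three code references are hypothetical stand-ins for plugins' run methods, and the way earlier results are exposed to later plugins (merged into an Extra hash on the line) is an assumption made for illustration.

```perl
use strict;
use warnings;

# Toy stand-in for the plugin loop: each element of @plugins is a code ref
# playing the role of one plugin's run method.
sub run_plugin_chain {
    my ($line, @plugins) = @_;

    for my $plugin (@plugins) {
        my $result = $plugin->($line);

        # undef means "filter this line": stop here, later plugins never run.
        return undef unless defined $result;

        # Otherwise merge the returned hash (possibly empty) into Extra.
        $line->{Extra}{$_} = $result->{$_} for keys %$result;
    }

    return $line;
}

my @calls;   # records which stand-in plugins were actually run

my $annotate = sub { push @calls, 'annotate'; return { ALLELE_LEN => length $_[0]{Allele} } };
my $filter   = sub { push @calls, 'filter';   return $_[0]{Extra}{ALLELE_LEN} > 1 ? {} : undef };
my $last     = sub { push @calls, 'last';     return { SEEN => 1 } };

# The first line passes the filter; the second is dropped before $last runs.
my $kept    = run_plugin_chain({ Allele => 'AT', Extra => {} }, $annotate, $filter, $last);
my $dropped = run_plugin_chain({ Allele => 'A',  Extra => {} }, $annotate, $filter, $last);
```

In this model the second plugin filters on a value added by the first, which is the pipeline pattern described above, and the third plugin is never called for the dropped line.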

    Plugin warnings in VEP

    By default, a VEP run does not fail even when a plugin raises warnings or compilation errors. To make VEP fail in these cases instead, use the --safe flag. This is particularly useful for ensuring that plugins run properly when VEP is used in pipelines.


    Intergenic variants

    When a variant falls in an intergenic region, it will usually not have any consequence types called, and hence will not have any associated VariationFeatureOverlap objects. In this special case, VEP creates a new VariationFeatureOverlap that overlaps a feature of type "Intergenic".

    To force your plugin to handle these, you must add "Intergenic" to the feature types that it will recognize; you do this by writing your own feature_types subroutine:

    sub feature_types {
        return ['Transcript', 'Intergenic'];
    }

    This will cause your plugin to handle any variation features that overlap transcripts or intergenic regions. To also include any regulatory features, you should use the generic type "Feature":

    sub feature_types {
        return ['Feature', 'Intergenic'];
    }