
About Ensembl Variation

The Ensembl Variation database stores areas of the genome that differ between individual genomes ("variants") and, where available, associated disease and phenotype information.
There are different types of variants for several species:

  • single nucleotide polymorphisms (SNPs)
  • short nucleotide insertions and/or deletions
  • longer variants classified as structural variants (including CNVs)

Below is a list of the most common types of variants stored in the Ensembl Variation databases:

Sequence variants

Type          Description
SNP           Single Nucleotide Polymorphism
Insertion     Insertion of one or several nucleotides
Deletion      Deletion of one or several nucleotides
Indel         An insertion and a deletion, affecting 2 or more nucleotides
Substitution  A sequence alteration where the length of the change in the variant is the same as that of the reference
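The definitions in the table above can be sketched as a small classifier over a reference/alternative allele pair. This is only an illustration of the categories, not the logic Ensembl itself uses: the function name `classify_sequence_variant` is hypothetical, and the use of "-" (or an empty string) for an absent allele is an assumption made for this example.

```python
def classify_sequence_variant(ref: str, alt: str) -> str:
    """Classify a reference/alternative allele pair into the table's categories.

    Assumption: an absent allele is written as "-" or as an empty string.
    """
    ref = "" if ref == "-" else ref.upper()
    alt = "" if alt == "-" else alt.upper()

    if ref == alt:
        raise ValueError("reference and alternative alleles are identical")
    if not ref:
        # Nothing in the reference, one or more nucleotides added.
        return "insertion"
    if not alt:
        # One or more reference nucleotides removed.
        return "deletion"
    if len(ref) == len(alt):
        # Same length: a single base is a SNP, a longer change is a substitution.
        return "SNP" if len(ref) == 1 else "substitution"
    # Different non-zero lengths: nucleotides are both removed and inserted.
    return "indel"


if __name__ == "__main__":
    for ref, alt in [("G", "A"), ("-", "TC"), ("TC", "-"), ("C", "GTT"), ("GC", "TA")]:
        print(ref, alt, classify_sequence_variant(ref, alt))
```

The order of the checks matters: the empty-allele cases are handled first so that the length comparison only ever sees two non-empty alleles.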

Structural variants

Type           Description
CNV            Copy Number Variation: increases ("gain") or decreases ("loss") the copy number of a given region
Inversion      A continuous nucleotide sequence is inverted in the same position
Translocation  A region of nucleotide sequence that has translocated to a new position
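To make the CNV and inversion descriptions concrete, here is a toy sketch that applies them to a short string. The function names, the half-open `[start, end)` coordinates and the example sequence are all hypothetical. Modelling an inversion as the reverse complement of the region on the forward strand is an assumption of this sketch; real structural variants span far larger regions and are not stored this way.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def _check_region(sequence: str, start: int, end: int) -> None:
    if not 0 <= start < end <= len(sequence):
        raise ValueError("region must satisfy 0 <= start < end <= len(sequence)")


def copy_number_change(sequence: str, start: int, end: int, copies: int) -> str:
    """Return the sequence with region [start, end) present `copies` times.

    The reference is taken to carry one copy, so copies=2 is a gain of one
    copy and copies=0 is a loss of one copy.
    """
    _check_region(sequence, start, end)
    if copies < 0:
        raise ValueError("copies must be non-negative")
    return sequence[:start] + sequence[start:end] * copies + sequence[end:]


def invert(sequence: str, start: int, end: int) -> str:
    """Return the sequence with region [start, end) inverted in place.

    Assumption: the inverted region reads as its reverse complement.
    """
    _check_region(sequence, start, end)
    region = sequence[start:end]
    return sequence[:start] + region.translate(_COMPLEMENT)[::-1] + sequence[end:]


if __name__ == "__main__":
    reference = "AAACCGTTT"
    print(copy_number_change(reference, 3, 6, 2))  # gain of one copy
    print(copy_number_change(reference, 3, 6, 0))  # loss of one copy
    print(invert(reference, 3, 6))
```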

These are only some examples of the variant types you can find in Ensembl. The full list is available here.

We predict the effects of variants on the Ensembl transcripts and regulatory features for each species. You can run the same analysis on your own data using the Variant Effect Predictor.
These data are integrated with other data sources in Ensembl, and can be accessed using the API (see the links in the right-hand side menu) or the website.


Perl API

A comprehensive Perl Application Programming Interface (API) provides efficient access to the Ensembl Variation database.

MySQL database

VCF import

The import_vcf.pl script populates an Ensembl Variation database from a VCF (Variant Call Format) file. A description of the VCF file format can be found on the 1000 Genomes project website.
The script can either populate a database from scratch, or add data to an existing database.
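For readers unfamiliar with VCF, the sketch below shows the kind of information one data line carries: the chromosome, position, identifier, reference allele and alternative alleles. It is a minimal, hypothetical Python illustration of the fixed VCF columns (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO), not a description of how import_vcf.pl works; the names `VcfRecord` and `parse_vcf_line` are invented for this example.

```python
from typing import List, NamedTuple, Optional


class VcfRecord(NamedTuple):
    chrom: str
    pos: int          # 1-based position of the first reference base
    ids: List[str]    # e.g. rsIDs; empty when the ID column is "."
    ref: str
    alts: List[str]   # one entry per alternative allele


def parse_vcf_line(line: str) -> Optional[VcfRecord]:
    """Parse one VCF data line; return None for header and blank lines."""
    if not line.strip() or line.startswith("#"):
        return None
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < 8:
        raise ValueError("expected at least 8 tab-separated columns")
    chrom, pos, ident, ref, alt = fields[:5]
    # "." marks a missing value; IDs are ';'-separated, ALT alleles ','-separated.
    ids = [] if ident == "." else ident.split(";")
    alts = [] if alt == "." else alt.split(",")
    return VcfRecord(chrom, int(pos), ids, ref, alts)


if __name__ == "__main__":
    example = "20\t14370\trs6054257\tG\tA\t29\tPASS\tNS=3;DP=14"
    print(parse_vcf_line(example))
```

Only the first five columns are kept here because they are enough to identify a variant and its alleles; QUAL, FILTER, INFO and any genotype columns are ignored in this sketch.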


  • Bronwen L. Aken, Premanand Achuthan, Wasiu Akanni, M. Ridwan Amode, Friederike Bernsdorff, Jyothish Bhai, Konstantinos Billis, Denise Carvalho-Silva, Carla Cummins, Peter Clapham, Laurent Gil, Carlos García Girón, Leo Gordon, Thibaut Hourlier, Sarah E. Hunt, Sophie H. Janacek, Thomas Juettemann, Stephen Keenan, Matthew R. Laird, Ilias Lavidas, Thomas Maurel, William McLaren, Benjamin Moore, Daniel N. Murphy, Rishi Nag, Victoria Newman, Michael Nuhn, Chuang Kee Ong, Anne Parker, Mateus Patricio, Harpreet Singh Riat, Daniel Sheppard, Helen Sparrow, Kieron Taylor, Anja Thormann, Alessandro Vullo, Brandon Walts, Steven P. Wilder, Amonida Zadissa, Myrto Kostadima, Fergal J. Martin, Matthieu Muffato, Emily Perry, Magali Ruffier, Daniel M. Staines, Stephen J. Trevanion, Fiona Cunningham, Andrew Yates, Daniel R. Zerbino and Paul Flicek
    Ensembl 2017
    Nucleic Acids Research
    doi: 10.1093/nar/gkw1104

  • McLaren W, Gil L, Hunt SE, Riat HS, Ritchie GRS, Thormann A, Flicek P and Cunningham F.
    The Ensembl Variant Effect Predictor
    Genome Biology 17:122 (2016)

  • Rios D, McLaren WM, Chen Y, Birney E, Stabenau A, Flicek P, Cunningham F.
    A Database and API for variation, dense genotyping and resequencing data
    BMC Bioinformatics 11:238 (2010)

  • Chen Y, Cunningham F, Rios D, McLaren WM, Smith J, Pritchard B, Spudich GM, Brent S, Kulesha E, Marin-Garcia P, Smedley D, Birney E, Flicek P.
    Ensembl Variation Resources
    BMC Genomics 11(1):293 (2010)