Projects using Ensembl
The Ensembl project is both a source of data related to genome sequences and an open source software system that can be used to organise any such data.
Collaborations
Ensembl are active collaborators in a number of projects, contributing code, know-how and a platform from which to distribute data.
Project | Description | Code used/data provided |
---|---|---|
1000 Genomes | A web browser and FTP sites are provided to access human genetic variation catalogued by the project. | Web front-end derived from Ensembl webcode |
Blueprint | Epigenomics project analysing samples from healthy and diseased individuals. | Track Hub available on the Ensembl browser |
gEVAL | This browser can be used to inspect the reference assemblies of human, mouse and zebrafish being created by the Genome Reference Consortium (GRC). | Customised webcode, pipelines (not genebuild) and Compara |
HEROIC | Functional genomics data for mouse (Mus musculus) | Data viewable on Ensembl via DAS. |
Neanderthal Genome Browser | Preliminary assembly of Homo sapiens neanderthalensis. | Web front-end derived from Ensembl webcode, Ensembl schema databases. |
WormBase Parasite | Website presenting draft genome sequences for helminths (parasitic nematodes and flatworms) | Web front-end derived from Ensembl webcode, Ensembl schema databases, Ensembl Compara, REST API and (slightly modified) BioMart |
NextGen | A collaborative project investigating biodiversity of livestock species. | Variant call sets produced using the Ensembl VEP, viewable in the Ensembl browser. |
Quantomics | Large-scale project to analyse sequence and variation in livestock genomes. | Sequence alignments created using the Ensembl Compara pipelines. |
External projects
Our open access data and open source code mean that many projects are able to make use of Ensembl data and software without our active involvement. We're happy to list those we know about here, but if your project is e!mpowered and you're not on this list, let us know.
Project | Description | Code used |
---|---|---|
LepBase | Lepbase is the Lepidopteran genome database, providing an Ensembl genome browser, a BLAST server, a download server and more as a set of resources for moth and butterfly genome research. | Customised webcode |
COSMIC (Catalogue of Somatic Mutations in Cancer) | Web display of somatic sequence variant/mutation data | VEP |
PomBase | A resource for Schizosaccharomyces pombe that includes structural and functional annotation, literature curation and access to large-scale data sets. | Customised webcode, pipeline, database schema and API |
GermOnline | Microarray expression database focused on germline development | Webcode and underlying data |
Bgee | Comparison of gene expression patterns between species | Ensembl data and orthologues |
InterologWalk Perl Modules | Perl modules to determine protein-protein interactions from orthology and interaction data. | API, Compara |
VectorBase | Bioinformatic Resource Centre for Invertebrate Vectors of Human Pathogens | Customised Ensembl webcode, Ensembl Compara pipeline, Ensembl annotation pipeline |
Gramene | A Comparative Mapping Resource for Grains | Customised webcode, Ensembl Compara pipeline, Ensembl database used for annotation, import of the Arabidopsis Ensembl database from NASC. |
Sigenae | Sigenae EST ContigBrowser | |
TraC: Transcript Consensus | Web-based tool for visual comparison of alternative splicing isoforms | Can draw from Ensembl annotation and sequence data |
OrthoMaM | A database of alignments and trees based on orthologous exons and CDSs for mammalian species. | 1-to-1 orthologues extracted from Ensembl. |
Otter | Database backend for interactive curation of annotation | Otter is an extension of the Ensembl database schema |
Selectome | Database of positive selection. | Uses Ensembl gene trees, gene data and xrefs extracted via the APIs. |
CZ CELLxGENE Discover | A free-to-use online suite of tools that help researchers discover, download, and analyze single-cell datasets from modalities that include gene expression, chromatin accessibility, DNA methylation, and spatial transcriptomics. | All data in the portal have been standardized and annotated using an ontological shared vocabulary for cell and gene metadata. It uses an Ensembl gene reference across all data. |
If you are using Ensembl code in your project, you might like to download the 'empowered' logo (high-resolution PNG, suitable for use in print).
User-contributed code
Whilst we have developed a comprehensive Perl API in-house, we welcome contributions in other programming languages from the community.
R - ensemblQueryR
ensemblQueryR provides an R interface to the Ensembl REST API, permitting fast, flexible, user-friendly querying. Docker and Singularity images are available.
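For orientation, the kind of request that a REST client wrapper issues can be sketched in Python. This is not ensemblQueryR code and says nothing about how that package is implemented; it is a minimal illustration that assumes the public REST server at `https://rest.ensembl.org` and its `lookup/symbol` endpoint. The function names `lookup_symbol_url` and `lookup_symbol` are hypothetical, chosen for this sketch.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Ensembl REST server; not taken from the page above.
REST_SERVER = "https://rest.ensembl.org"


def lookup_symbol_url(species: str, symbol: str) -> str:
    """Build the URL for a 'lookup/symbol' request (hypothetical helper)."""
    path = "/lookup/symbol/{}/{}".format(
        urllib.parse.quote(species, safe=""),  # escape any unsafe characters
        urllib.parse.quote(symbol, safe=""),
    )
    # Ask the server for a JSON response via the content-type query parameter.
    return REST_SERVER + path + "?content-type=application/json"


def lookup_symbol(species: str, symbol: str) -> dict:
    """Fetch the gene record as a dict. Requires network access."""
    url = lookup_symbol_url(species, symbol)
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)
```

Calling `lookup_symbol("homo_sapiens", "BRCA2")` would return the decoded JSON record for that gene, provided the server is reachable. Building the URL in a separate function keeps that step testable without a network connection.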
Ruby
A Ruby API has been developed by Jan Aerts. A small example showing what it can do (including coordinate transformation and reflecting on the types of associations for a given class) can be found on the wiki page.
Java
An open source Java API (JEnsembl) has been developed by the Bioinformatics Group at The Roslin Institute. The code has been designed to be version-aware (a single installation works against current and previous releases) and is hosted, together with documentation, examples and contact information for the Roslin group, at SourceForge.
Note that we do not support these user-contributed packages; please contact the original developer if you have any questions or comments.