Gene summary

This page gives an overview of the information available at the gene level. It is composed of three sections.

At the top, the page shows the gene name and Ensembl gene ID, the full description of the gene, its synonyms, its genomic location and strand, INSDC coordinates, and its number of transcripts.

Below that, the page shows the Transcript Table, the Summary (with links to external databases), and the Gene Diagram.

TRANSCRIPT TABLE

The transcript table lists each splice variant of a gene, i.e. both protein-coding and non-coding transcripts. In addition to transcript and translation length, it displays the biotype, mapped CCDS and RefSeq IDs, and the MANE, APPRIS and TSL flags. This table is hidden by default. Each transcript is given an Ensembl Transcript ID, which is unique and stable.

SUMMARY

It provides additional information and links to external databases:

  • Name - from official gene nomenclature committees such as HGNC (for human) and MGI (for mouse)
  • CCDS - coding sequence IDs from the Consensus Coding Sequence Set
  • UniProtKB - protein IDs from UniProtKB that match one of the translations of this gene
  • RefSeq - indicates whether the gene has transcript(s) identified as MANE
  • LRG - IDs from the Locus Reference Genomic (LRG) project matching the Ensembl gene
  • Ensembl version - versioning of the Ensembl gene ID
  • GRCh37 assembly - (for human only) with genomic coordinates and links to the Location and Gene views of the gene on the previous human assembly
  • Gene type - includes both status (e.g. known) and biotype (e.g. protein coding)
  • Annotation method - Ensembl automatic annotation, HAVANA manual annotation, or a merge of the two (for human, mouse, zebrafish, pig, and rat)
  • Alternative genes - IDs from the HAVANA project that match the Ensembl gene
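The "Ensembl version" item above refers to the version number that accompanies a stable gene ID, usually written after a dot (for example ENSG00000139618.17, where the version shown here is purely illustrative). The sketch below shows one way to split such a string into its ID and version. The names StableId and split_versioned_id are hypothetical helpers, not part of any Ensembl library, and the code assumes the "ID.version" convention without validating the ID against a species-specific pattern.

```python
from typing import NamedTuple, Optional


class StableId(NamedTuple):
    """A stable ID and its optional version (hypothetical helper type)."""
    identifier: str
    version: Optional[int]


def split_versioned_id(text: str) -> StableId:
    """Split 'ID.version' into its parts; the version is None if absent.

    Assumption: the version, when present, is the run of digits after
    the last dot. No check is made that the ID itself is well formed.
    """
    cleaned = text.strip()
    identifier, dot, version = cleaned.rpartition(".")
    if dot and identifier and version.isdigit():
        return StableId(identifier, int(version))
    # No dot, or the suffix is not numeric: treat the whole string as the ID.
    return StableId(cleaned, None)


if __name__ == "__main__":
    # Illustrative input only; the version number is not taken from a release.
    print(split_versioned_id("ENSG00000139618.17"))
    print(split_versioned_id("ENSG00000139618"))
```

Keeping the version separate makes it possible to compare the same gene across releases by its unversioned ID while still noticing when the version has changed.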

GENE DIAGRAM

The gene diagram depicts the gene and all its transcripts in the context of the genome. The image can be configured to add or remove data tracks.

Transcripts are drawn as boxes for exons and connecting lines for introns. Filled boxes show coding sequence, and empty boxes show UTRs (untranslated regions). Transcripts drawn above the blue bar (i.e. the contig) are on the forward strand, whereas transcripts below it are on the reverse strand.

Transcripts are represented by different colours:

  • Blue, pink or grey transcripts are noncoding. Go to the transcript summary help page for more information.
  • Red or gold transcripts are protein coding. Gold transcripts are identical between the annotation from the Ensembl automatic pipeline and the manual annotation from HAVANA.

BIOTYPE

A biotype is an indicator of biological significance for genes.

If a gene has been manually annotated (i.e. in human, mouse, zebrafish, pig, and rat), we use the biotypes assigned by the HAVANA team.

Biotypes can be grouped into protein coding, pseudogene, long noncoding and short noncoding.
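As a rough illustration of this grouping, the sketch below maps a handful of biotype names to the four groups named above. The mapping is an assumption made for the example: it is deliberately incomplete and is not the authoritative Ensembl classification. The names BIOTYPE_GROUPS, biotype_group and count_by_group are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable

# Illustrative, incomplete mapping from biotype name to group.
# Assumption: the names follow the usual underscore-separated spelling.
BIOTYPE_GROUPS: Dict[str, str] = {
    "protein_coding": "protein coding",
    "processed_pseudogene": "pseudogene",
    "unprocessed_pseudogene": "pseudogene",
    "lncRNA": "long noncoding",
    "miRNA": "short noncoding",
    "snRNA": "short noncoding",
    "snoRNA": "short noncoding",
}


def biotype_group(biotype: str) -> str:
    """Return the group for a biotype, or 'unclassified' if unknown."""
    if biotype in BIOTYPE_GROUPS:
        return BIOTYPE_GROUPS[biotype]
    # Fallback assumption: any name ending in 'pseudogene' is a pseudogene.
    if biotype.endswith("pseudogene"):
        return "pseudogene"
    return "unclassified"


def count_by_group(biotypes: Iterable[str]) -> Counter:
    """Count how many of the given biotypes fall into each group."""
    return Counter(biotype_group(b) for b in biotypes)


if __name__ == "__main__":
    transcripts = ["protein_coding", "lncRNA", "miRNA", "processed_pseudogene"]
    for group, n in sorted(count_by_group(transcripts).items()):
        print(f"{group}: {n}")
```

A lookup table with an explicit "unclassified" fallback keeps unknown biotypes visible rather than silently assigning them to a group.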