Transcript haplotype
For any given transcript, this page displays actual haplotypes of variants found in that transcript or protein in 1000 Genomes individuals.
These are shown as a table listing each haplotype as a series of alterations from the reference sequence. Protein haplotypes are shown in the form 523R>Q, indicating the amino acid position, the reference amino acid and the alternative amino acid. CDS haplotypes are shown in the form 1568G>A, indicating the base position in the CDS, the reference base and the alternative base. Where a haplotype has multiple variants, these are shown separated by commas.
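The notation described above can be split into its parts programmatically. The following is a minimal Python sketch, not part of Ensembl; the names Alteration and parse_haplotype are hypothetical, the example 1600C>T is made up for illustration, and the pattern only assumes the simple position, reference, ">", alternative form shown above, so other kinds of change (such as insertions or deletions) may need extra handling.

```python
import re
from typing import List, NamedTuple


class Alteration(NamedTuple):
    position: int  # 1-based position in the protein or CDS
    ref: str       # reference amino acid or base
    alt: str       # alternative amino acid or base


# Assumed pattern: digits, reference residue(s), ">", alternative residue(s).
_ALTERATION = re.compile(r"^(\d+)([A-Za-z*-]+)>([A-Za-z*-]+)$")


def parse_haplotype(name: str) -> List[Alteration]:
    """Split a comma-separated haplotype name into its alterations."""
    alterations = []
    for part in name.split(","):
        part = part.strip()
        match = _ALTERATION.match(part)
        if match is None:
            raise ValueError(f"Unrecognised alteration: {part!r}")
        position, ref, alt = match.groups()
        alterations.append(Alteration(int(position), ref, alt))
    return alterations


# A protein haplotype with one change, and a CDS haplotype with two.
protein = parse_haplotype("523R>Q")
cds = parse_haplotype("1568G>A,1600C>T")  # second variant is illustrative
```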
The table also shows the overall frequency of each haplotype across all 1000 Genomes individuals, as well as in each of the super-populations studied by the 1000 Genomes Project. You can view this table for either protein or CDS haplotypes.
To get more details on a haplotype, click on it to jump to a details section that opens at the bottom of the page. If you scroll back up and choose another haplotype, the page jumps back down and the details section updates to show the new haplotype.
The details section includes information on Population frequencies, Aligned sequence, Sequence, Corresponding protein or CDS haplotypes (depending on which type of haplotype you are viewing) and Sample data.
The Population frequency section lists all the 1000 Genomes sub-populations where the haplotype was observed. For each population in which the haplotype is observed, its frequency is shown as a bar graph, a frequency and a count. The populations are sorted by super-population; hover over the population codes to get the full name.
The Aligned sequence section displays an alignment between the reference protein sequence, matches or mismatches with the haplotype protein sequence, the reference CDS sequence and matches or mismatches with the haplotype CDS sequence. Codons in the CDS are shown with yellow highlighting. Amino acid changes are highlighted with a colour indicating the likely effect on the protein function; refer to the legend at the top to see what the colours mean. Click on the variants to get more information, such as dbSNP ID, and go to the variant tab.
Because synonymous changes alter the CDS without altering the protein, multiple CDS haplotypes can give the same protein haplotype. An alignment for a protein haplotype may therefore have multiple lines, one for each alternative CDS haplotype. A CDS haplotype will only ever have one corresponding protein haplotype.
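The many-to-one relationship between CDS and protein haplotypes can be illustrated with a small Python sketch. This is not how Ensembl computes haplotypes; the nine-base reference CDS, the three CDS haplotypes and the function names are all invented for illustration, only single-base substitutions are handled, the standard genetic code is assumed, and the label "REF" is just a placeholder for "no protein change".

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, built in TCAG order (first base varies slowest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def apply_cds_haplotype(reference_cds: str, name: str) -> str:
    """Apply single-base changes such as '5G>A,6G>A' to a reference CDS."""
    seq = list(reference_cds)
    for part in name.split(","):
        left, alt = part.strip().split(">")
        position, ref = int(left[:-1]), left[-1]
        if seq[position - 1] != ref:
            raise ValueError(f"Reference base mismatch at {position}")
        seq[position - 1] = alt
    return "".join(seq)


def translate(cds: str) -> str:
    """Translate complete codons only; a trailing partial codon is ignored."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds) - 2, 3))


def protein_haplotype_name(ref_protein: str, hap_protein: str) -> str:
    """Name a protein haplotype by its differences from the reference."""
    diffs = [
        f"{i}{r}>{a}"
        for i, (r, a) in enumerate(zip(ref_protein, hap_protein), start=1)
        if r != a
    ]
    return ",".join(diffs) or "REF"


REFERENCE_CDS = "ATGCGGGAA"  # invented example
reference_protein = translate(REFERENCE_CDS)

# Group CDS haplotypes by the protein haplotype they produce.
grouped = defaultdict(list)
for cds_name in ["6G>A", "5G>A", "5G>A,6G>A"]:
    hap_protein = translate(apply_cds_haplotype(REFERENCE_CDS, cds_name))
    grouped[protein_haplotype_name(reference_protein, hap_protein)].append(cds_name)
```

In this toy example two different CDS haplotypes collapse onto the same protein haplotype, while each CDS haplotype maps to exactly one protein haplotype.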
The Sequence section shows the complete protein or CDS sequence of the haplotype.
The Corresponding CDS haplotype section is shown for protein haplotypes and lists all the possible CDS haplotypes for that protein haplotype. The Corresponding protein haplotype section is shown for a CDS haplotype and shows the one protein haplotype produced by that CDS haplotype.
The Sample data table lists all the 1000 Genomes individuals who have that particular haplotype, using their unique identifiers. Their population codes are shown. Copies indicate whether the individual is homozygous (2) or heterozygous (1) for that haplotype.
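To show how copy numbers of this kind relate to a population count and frequency, here is a minimal Python sketch. It is an assumption about the arithmetic, not a description of Ensembl's implementation: it treats every individual as carrying two copies of the transcript (which would not hold for, say, X-linked genes in males), and the sample identifiers and population sizes are made up.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def haplotype_frequencies(
    samples: List[Tuple[str, str, int]],
    population_sizes: Dict[str, int],
) -> Dict[str, Tuple[int, float]]:
    """Return {population: (count, frequency)} for one haplotype.

    samples: (sample identifier, population code, copies) rows, with
             copies = 2 for homozygous and 1 for heterozygous carriers.
    population_sizes: number of individuals sampled in each population.
    """
    counts = defaultdict(int)
    for _sample_id, population, copies in samples:
        counts[population] += copies

    # Assumes a diploid locus: two chromosomes per individual.
    return {
        population: (counts[population], counts[population] / (2 * size))
        for population, size in population_sizes.items()
        if size > 0
    }


# Illustrative rows only; identifiers and sizes are not real data.
example_samples = [
    ("SAMPLE_A", "GBR", 2),
    ("SAMPLE_B", "GBR", 1),
    ("SAMPLE_C", "YRI", 1),
]
example_sizes = {"GBR": 5, "YRI": 4}
frequencies = haplotype_frequencies(example_samples, example_sizes)
```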
For more information on transcript haplotypes, please see the following article:
William Spooner, William McLaren, Timothy Slidel, Donna K. Finch, Robin Butler, Jamie Campbell, Laura Eghobamien, David Rider,
Christine Mione Kiefer, Matthew J. Robinson, Colin Hardman, Fiona Cunningham, Tristan Vaughan, Paul Flicek & Catherine Chaillan Huntington.
Haplosaurus computes protein haplotypes for use in precision drug design.
Nature Communications volume 9, Article number: 4128 (2018)
https://www.nature.com/articles/s41467-018-06542-1