The Allele Frequency Calculator

The Allele Frequency Calculator is a tool that retrieves frequency data for variants identified in the 1000 Genomes Project within a genomic region of interest.

When you reach the Allele Frequency Calculator web interface, you will be presented with a form in which you define the allele frequency data you want to retrieve.

Name for this job (optional): Giving each data request a unique name allows you to track and search the list of your submitted jobs.

Species: The Allele Frequency Calculator is based on population frequency data generated by the 1000 Genomes Project, and is therefore only available for the human GRCh37 assembly, which is selected by default.

Region Lookup: Define your genomic region of interest in the format chromosome:start_coordinate-end_coordinate, e.g. 4:122868000-122946000.
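
As a minimal sketch of the region format described above, the following Python snippet checks a region string before it is entered in the form. The function name parse_region and the accepted chromosome names (1 to 22, X, Y, MT) are assumptions made for illustration; they are not part of the Allele Frequency Calculator itself.

```python
import re

# Assumed pattern: chromosome name, a colon, start coordinate, a hyphen, end coordinate.
REGION_PATTERN = re.compile(r"^([1-9]|1[0-9]|2[0-2]|X|Y|MT):(\d+)-(\d+)$")


def parse_region(region):
    """Split 'chromosome:start-end' into a (chromosome, start, end) tuple."""
    match = REGION_PATTERN.match(region.strip())
    if match is None:
        raise ValueError(f"Region is not in chromosome:start-end format: {region!r}")
    chromosome = match.group(1)
    start, end = int(match.group(2)), int(match.group(3))
    if start < 1 or start > end:
        raise ValueError(f"Start must be at least 1 and not greater than end: {region!r}")
    return chromosome, start, end


print(parse_region("4:122868000-122946000"))  # ('4', 122868000, 122946000)
```

A check like this catches a swapped start and end coordinate or a missing colon before the job is submitted.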

Choose data collections or provide your own file URLs: Select the phase of the 1000 Genomes Project for which you wish to retrieve frequency data.

Select Phase 3 / Phase 1 populations: If you have selected either 'Phase 3' or 'Phase 1' from the 'Choose data collections or provide your own file URLs' section (above), you can now select the 1000 Genomes Project populations for which you wish to retrieve frequency data. By default, 'ALL' is selected, which returns the frequency data for all the 1000 Genomes populations combined. You can also select one or more individual populations to retrieve the frequency data for particular populations of interest.

Note: You can select 'ALL' and multiple populations in one query, which will return the frequency data for all 1000 Genomes populations combined, followed by the frequency data for each selected individual population.

If you have selected 'Provide file URLs' from the 'Choose data collections or provide your own file URLs' section (above), you can now provide the URLs of the files containing the variation and frequency data you want the Allele Frequency Calculator to use in its calculation.

Genotype file URL: Provide the URL of a VCF file containing the population genotypes.

Sample-population mapping file URL: Provide the URL of a file that lists all the individuals and the populations from which they come.
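
To illustrate how a genotype file and a sample-population mapping file can be combined into allele counts, here is a minimal Python sketch. It is not the calculator's actual implementation. It assumes the mapping file is tab-separated, with the sample ID in the first column and the population in the second, and that genotypes use the VCF GT notation (e.g. 0|1). The function names and the sample and population labels (S1, POP_A and so on) are hypothetical.

```python
def read_sample_populations(lines):
    """Map sample ID -> population from tab-separated 'sample<TAB>population' lines."""
    mapping = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Skip blank lines and an assumed header row starting with 'sample'.
        if len(fields) < 2 or fields[0] == "sample":
            continue
        mapping[fields[0]] = fields[1]
    return mapping


def count_alleles(samples, genotypes, sample_to_pop, population):
    """Return (total_count, alt_count) for one variant in one population."""
    total = alt = 0
    for sample, genotype in zip(samples, genotypes):
        sample_pop = sample_to_pop.get(sample)
        if sample_pop is None:
            continue  # sample is not listed in the mapping file
        if population != "ALL" and sample_pop != population:
            continue
        # GT is the first colon-separated field; alleles are separated by '|' or '/'.
        for allele in genotype.split(":")[0].replace("|", "/").split("/"):
            if allele == ".":
                continue  # missing call, not counted
            total += 1
            if allele != "0":
                alt += 1  # any non-reference allele counts as alternative
    return total, alt


# Hypothetical mapping file content and genotype columns for a single variant.
mapping = read_sample_populations(["sample\tpop\n", "S1\tPOP_A\n", "S2\tPOP_A\n", "S3\tPOP_B\n"])
samples = ["S1", "S2", "S3"]
genotypes = ["0|1", "1|1", "0|0"]

print(count_alleles(samples, genotypes, mapping, "POP_A"))  # (4, 3)
print(count_alleles(samples, genotypes, mapping, "ALL"))    # (6, 3)
```

In this sketch, 'ALL' simply pools every sample listed in the mapping file, and multi-allelic sites are simplified by counting all non-reference alleles together.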

Output: The output of the calculator can be previewed on the web page, and an output file can be downloaded. The header columns are:

CHR: Chromosome 
POS: Start position of the variant 
ID: Identification of the variant 
REF: Reference allele 
ALT: Alternative allele 
TOTAL_CNT: Total number of alleles in samples of the chosen population(s) 
ALT_CNT: Number of alternative alleles observed in samples of the chosen population(s)
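
The two count columns relate to frequency as ALT_CNT divided by TOTAL_CNT. The Python sketch below shows that arithmetic on a downloaded output file. It assumes the file is tab-separated, that its first line is a header using the column names listed above, and that each count is a single integer. The function name alt_frequencies and the example row are hypothetical.

```python
import csv
import io


def alt_frequencies(handle):
    """Yield (variant ID, alternative allele frequency) for each row of the output."""
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        total = int(row["TOTAL_CNT"])
        alt = int(row["ALT_CNT"])
        # Frequency is undefined when no alleles were counted.
        frequency = alt / total if total else None
        yield row["ID"], frequency


# Hypothetical output content: a header line followed by one made-up variant row.
example_output = (
    "CHR\tPOS\tID\tREF\tALT\tTOTAL_CNT\tALT_CNT\n"
    "4\t122868100\texample_variant\tA\tG\t200\t50\n"
)

for variant_id, frequency in alt_frequencies(io.StringIO(example_output)):
    print(variant_id, frequency)  # example_variant 0.25
```

For a real download, replace the io.StringIO wrapper with an open file handle.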