Cave fish assembly and gene annotation

Assembly

The Astyanax_mexicanus-2.0 assembly was submitted by Washington University School of Medicine in September 2017. The assembly is at chromosome level, consisting of 3,030 contigs assembled into 2,415 scaffolds, from which 25 chromosomes have been built. The N50 is the length such that 50% of the assembled genome lies in sequences of that length or longer. The contig N50 is 1,767,240 bp and the scaffold N50 is 35,377,769 bp.
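The N50 definition above can be illustrated with a minimal Python sketch. The function name `n50` and the toy contig lengths are assumptions made for illustration only; they are not the real contig lengths of this assembly, and this is not necessarily how the reported values were computed.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches at least half
    of the summed length of all sequences.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy example (hypothetical lengths): total = 300, half = 150.
# 80 + 70 = 150 reaches half of the total, so the N50 is 70.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # prints 70
```

Applied to the contig lengths of an assembly this gives the contig N50; applied to the scaffold lengths it gives the scaffold N50.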

The genome assembly represented here corresponds to GenBank Assembly ID GCA_000372685.1

Gene annotation

The Mexican tetra or blind cave fish (Astyanax mexicanus) is a freshwater fish of the family Characidae of the order Characiformes. The type species of its genus, it is native to the Nearctic ecozone, originating in the lower Rio Grande and the Nueces and Pecos Rivers in Texas, as well as the central and eastern parts of Mexico.

Growing to a maximum total length of 12 cm (4.7 in), the Mexican tetra is of typical characin shape, with unremarkable, drab coloration. Its blind cave form, however, is notable for having no eyes or pigment; it has a pinkish-white color to its body (resembling an albino). This fish, especially the blind variant, is reasonably popular among aquarists.

A. mexicanus is a peaceful species that spends most of its time in midlevel water above the rocky and sandy bottoms of pools and backwaters of creeks and rivers of its native environment. Coming from a subtropical climate, it prefers water with 6.5–8 pH, a hardness of up to 30 dGH, and a temperature range of 20 to 25 °C (68 to 77 °F). In the winter, some populations migrate to warmer waters. Its natural diet consists of crustaceans, insects, and annelids, although in captivity it is omnivorous.

The Mexican tetra has been treated as a subspecies of A. fasciatus, but this is not widely accepted. Additionally, the blind cave form is sometimes recognized as a separate species, A. jordani, but this directly contradicts phylogenetic evidence.

The gene annotation process was carried out using a combination of protein-to-genome alignments, annotation mapping from a suitable reference species and RNA-seq alignments (where RNA-seq data with appropriate metadata were publicly available). For each candidate gene region, a selection process was applied to choose the most appropriate set of transcripts based on evolutionary distance, experimental evidence for the source data and quality of the alignments.
Small ncRNAs were obtained using a combination of BLAST and Infernal/RNAfold.
Pseudogenes were identified by looking for genes with a large percentage of non-biological introns (introns of <10 bp), genes covered in repeats, or single-exon genes for which evidence of a functional multi-exon paralog was found elsewhere in the genome.
lincRNAs were predicted from RNA-seq data where no evidence of protein homology or protein domains could be found in the transcript.
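
The three pseudogene criteria described above can be sketched as a simple rule check. This is an illustrative sketch only, not the genebuild pipeline's actual code: the `GeneModel` fields, the function name and the two fraction thresholds are hypothetical, and only the 10 bp intron cutoff comes from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GeneModel:
    """Hypothetical summary of a gene model (not a real pipeline class)."""
    intron_lengths: List[int]      # intron lengths in bp
    repeat_fraction: float         # fraction of the gene overlapped by repeats
    exon_count: int
    has_multi_exon_paralog: bool   # functional multi-exon paralog found elsewhere


MIN_BIOLOGICAL_INTRON = 10   # from the text: introns of <10 bp are non-biological
SHORT_INTRON_FRACTION = 0.5  # assumed stand-in for "a large percentage"
REPEAT_FRACTION = 0.8        # assumed stand-in for "covered in repeats"


def pseudogene_reasons(gene: GeneModel) -> List[str]:
    """Return the criteria a gene model meets; an empty list means none."""
    reasons = []
    if gene.intron_lengths:
        short = sum(1 for n in gene.intron_lengths if n < MIN_BIOLOGICAL_INTRON)
        if short / len(gene.intron_lengths) >= SHORT_INTRON_FRACTION:
            reasons.append("short_introns")
    if gene.repeat_fraction >= REPEAT_FRACTION:
        reasons.append("repeat_covered")
    if gene.exon_count == 1 and gene.has_multi_exon_paralog:
        reasons.append("single_exon_with_paralog")
    return reasons


# Two of the three introns are shorter than 10 bp, so the first rule fires.
example = GeneModel(intron_lengths=[3, 5, 200], repeat_fraction=0.1,
                    exon_count=4, has_multi_exon_paralog=False)
print(pseudogene_reasons(example))  # prints ['short_introns']
```

Returning the list of matched criteria rather than a single boolean keeps it visible which rule flagged a given gene.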

In accordance with the Fort Lauderdale Agreement, please check the publication status of the genome/assembly before publishing any genome-wide analyses using these data.

More information

General information about this species can be found in Wikipedia.

Statistics

Summary

Assembly: AstMex102, INSDC Assembly GCA_000372685.1, Apr 2013
Base Pairs: 964,248,202
Golden Path Length: 1,191,242,572
Annotation method: Full genebuild
Genebuild started: Apr 2013
Genebuild released: Dec 2013
Genebuild last updated/patched: Dec 2013
Database version: 94.102

Gene counts

Coding genes: 23,042
Pseudogenes: 21
Gene transcripts: 24,428

Other

Genscan gene predictions: 39,152

About this species