Available databases


The phylogenetic ranges of the individual databases, each compiled from the original sources described below, are shown in the tree below.

 

Descriptions of the original sources of the above databases are given below.

Database 1. Human - RefSeq

Human peptides registered in RefSeq were downloaded from NCBI Protein on June 14, 2021.

Database 2. Human - Ensembl 104
All 'known' human peptide sequences were downloaded from the Ensembl FTP site.

Database 3. Non-human eutherians - Ensembl 104
Peptide sequences for 'known' genes of non-human eutherian species listed here were downloaded from the Ensembl FTP site.

Database 4. Non-eutherian mammals - Ensembl 104
Peptide sequences for 'known' genes of non-eutherian mammals listed here were downloaded from the Ensembl FTP site.

Database 5. Non-mammalian bony vertebrates - Ensembl 104 and others
Peptide sequences for 'known' genes of all non-mammalian vertebrates listed here were downloaded from the Ensembl FTP site. In addition, peptides predicted in the genomes of the green sea turtle (Chelonia mydas) and the painted turtle (Chrysemys picta) were downloaded from NCBI Protein under their BioProject entries [ Chelonia | Chrysemys ]. The peptide sequence set for the Chinese alligator was downloaded from GigaDB. The most recently released gene models for the Madagascar ground gecko, produced by our group (Hara et al., 2018. BMC Biol.), have also been included.

Database 6. Cartilaginous fish (chondrichthyans) and cyclostomes
Peptide sequences predicted in the whole genome assembly of the elephant shark (published in Nature) were downloaded from here, and those of three elasmobranch species (brownbanded bamboo shark, whale shark, and cloudy catshark; Hara, Yamaguchi et al., 2018) available at Squalomix have also been included. Peptide sequences for the sea lamprey (Petromyzon marinus) predicted by the genome annotation consortium (Smith et al., 2013. Nat. Genet.) have also been included in this database, in addition to the peptides predicted for this species by Ensembl. Peptides predicted on the publicly available genome assembly of Lethenteron camtschaticum (Japanese lamprey or Arctic lamprey) (LetJap1.0), released by the Venkatesh Lab, are derived from our own gene model inference and are available at our laboratory web site.

Database 7. All vertebrate entries except mammals in NCBI Protein
Peptide sequences registered for the taxonomic group 'Vertebrata' [ link ], excluding those of mammals, were downloaded in late July 2021.
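
A selection of this kind (one taxonomic group minus a subgroup) can be expressed as a single NCBI Entrez search term. The sketch below only builds such a term as a string. The exact query behind the link above is not reproduced on this page, so the function name and the form of the term are assumptions for illustration. The taxonomy IDs are those assigned by NCBI Taxonomy to Vertebrata (7742) and Mammalia (40674).

```python
def taxon_minus_subgroup(include_taxid: int, exclude_taxid: int) -> str:
    """Build an Entrez search term covering one taxon minus a subgroup.

    Hypothetical helper for illustration; not the query actually used
    to compile the database.
    """
    # "[Organism:exp]" expands the match to all descendants of the taxon.
    return (
        f"txid{include_taxid}[Organism:exp] "
        f"NOT txid{exclude_taxid}[Organism:exp]"
    )


VERTEBRATA = 7742  # NCBI Taxonomy ID for Vertebrata
MAMMALIA = 40674   # NCBI Taxonomy ID for Mammalia

term = taxon_minus_subgroup(VERTEBRATA, MAMMALIA)
print(term)
```

The resulting string can be pasted into the NCBI Protein search box to inspect which entries such a selection would cover.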

Database 8. Invertebrate deuterostomes
Peptide sequences of the species listed below were downloaded from the FTP site of Ensembl Genomes:

sea urchin (Strongylocentrotus purpuratus)
European lancelet (Branchiostoma lanceolatum)

Peptide sequences based on the genome assemblies of Ciona intestinalis and Ciona savignyi were downloaded from the Ensembl FTP site. Sequences of Oikopleura dioica were downloaded from the individual web sites of the genome projects. Peptide sequences based on the version 2 Branchiostoma floridae genome assembly by JGI were downloaded from NCBI Protein. Most recently, gene models for two hemichordate species, Saccoglossus kowalevskii and Ptychodera flava, have also been included.

Database 9. Arthropods - EnsemblGenomes 51
Peptide sequences of the species listed below were downloaded from the FTP site of Ensembl Genomes:

waterflea Daphnia pulex
human louse Pediculus humanus
leaf cutter ant Atta cephalotes
red fire ant Solenopsis invicta
honeybee Apis mellifera
monarch butterfly Danaus plexippus
silkworm Bombyx mori
African malaria mosquito Anopheles gambiae
yellow fever mosquito Aedes aegypti
house mosquito Culex quinquefasciatus
12 Drosophila species including Drosophila melanogaster
postman butterfly Heliconius melpomene
small pteromalid parasitoid wasp Nasonia vitripennis
deer tick or blacklegged tick Ixodes scapularis
red flour beetle Tribolium castaneum
mountain pine beetle Dendroctonus ponderosae
pea aphid Acyrthosiphon pisum
Glanville fritillary Melitaea cinxia
termite Zootermopsis nevadensis
Antarctic midge Belgica antarctica
common eastern bumblebee Bombus impatiens
sea louse Lepeophtheirus salmonis
Australian sheep blowfly Lucilia cuprina
itch mite Sarcoptes scabiei
African social velvet spider Stegodyphus mimosarum
Asian long-horn beetle Anoplophora glabripennis
buff-tailed bumblebee Bombus terrestris
Hessian fly Mayetiola destructor

Database 10. Nematodes - EnsemblGenomes 51 and others
Peptide sequences of the species listed below were downloaded from the FTP site of Ensembl Genomes:

Trichinella spiralis
Caenorhabditis elegans
Caenorhabditis brenneri
Caenorhabditis briggsae
Caenorhabditis japonica
Caenorhabditis remanei
Brugia malayi
Pristionchus pacificus
Loa loa
Onchocerca volvulus
Strongyloides ratti

Predicted peptides for Meloidogyne incognita were downloaded from its genome project web page.

Database 11. Other protostomes - EnsemblGenomes 51 and others
Peptide sequences of the species listed below were downloaded from the FTP site of Ensembl Genomes:

Schistosoma mansoni
Capitella teleta
Helobdella robusta
Pacific oyster Crassostrea gigas
Octopus bimaculoides
Lingula anatina
bdelloid rotifer Adineta vaga

Peptide sequences of predicted genes for other lophotrochozoan species (Schistosoma japonicum and pearl oyster Pinctada fucata) were downloaded from the web sites of the individual projects.

Database 12. Non-bilaterian metazoans (cnidarians, ctenophoran, placozoan & poriferan) - EnsemblGenomes 51 and others
Peptide sequences of the species listed below were downloaded from the FTP site of Ensembl Genomes:

placozoan Trichoplax adhaerens
Nematostella vectensis
poriferan Amphimedon queenslandica
comb jellyfish Mnemiopsis leidyi
myxosporean Thelohanellus kitauei

Predicted peptide sequences for two cnidarians (Acropora digitifera and Hydra magnipapillata) were downloaded from individual sources.

Database 13. All metazoan entries except vertebrates in NCBI Protein
Peptide sequences registered for the taxonomic group 'Metazoa' [ link ], excluding those of vertebrates, were downloaded in late July 2021.


Notes
Please contact the chief administrator if you know of any source of sequences that seems more authentic or comprehensive than the one covered here.
