Reference Databases for Metabarcoding
These databases were built with the CRUX pipeline, part of the Anacapa Toolkit (Curd et al., 2019, Methods in Ecology and Evolution).
We update these databases annually. If you are using a different primer or locus, we encourage you to make your own CRUX database. Let us know if you want additional reference libraries or if you need help making your own.
The databases were built using selected groups from the EMBL STD database as the seed sequences for EcoPCR. Several collaborators are exploring alternatives, such as using the entire SRA as seed sequences. This would bring in more sequences, for example prokaryotic draft genomes, which are more common online than Sanger-based barcoding loci. We encourage you to share your experiences, explorations, and preferences with us! EMBL data were downloaded in late July 2019, and the NCBI SRA data were downloaded in October 2019.
Links to batch download the CRUX databases for CALeDNA:
16S: min size 60, max size 400 DOWNLOAD HERE
18S: min size 80, max size 550 DOWNLOAD HERE
PITS: min size 100, max size 800 DOWNLOAD HERE
CO1: min size 100, max size 700 DOWNLOAD HERE
FITS: min size 80, max size 700 DOWNLOAD HERE
trnL: min size 33, max size 225 DOWNLOAD HERE
Vertebrate 12S: min size 40, max size 150 DOWNLOAD HERE
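
If you want to check whether your own sequences fall inside the size range listed for a locus, the bounds above can be written down as a small lookup table. The following is only a rough Python sketch and is not part of CRUX or the Anacapa Toolkit. It assumes the sizes are in base pairs and that both bounds are inclusive; the names `SIZE_BOUNDS`, `within_bounds`, and `filter_by_length` are made up for this illustration.

```python
# Size bounds copied from the list above.
# Assumptions for this sketch: units are base pairs, bounds are inclusive.
SIZE_BOUNDS = {
    "16S": (60, 400),
    "18S": (80, 550),
    "PITS": (100, 800),
    "CO1": (100, 700),
    "FITS": (80, 700),
    "trnL": (33, 225),
    "Vertebrate 12S": (40, 150),
}


def within_bounds(locus, sequence):
    """Return True if the sequence length lies inside the locus size range."""
    min_size, max_size = SIZE_BOUNDS[locus]
    return min_size <= len(sequence) <= max_size


def filter_by_length(locus, sequences):
    """Keep only the sequences whose length fits the given locus."""
    return [seq for seq in sequences if within_bounds(locus, seq)]
```

For example, `filter_by_length("trnL", my_sequences)` would keep only the sequences between 33 and 225 bases long, where `my_sequences` is a hypothetical list of sequence strings.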