Reference Databases for Metabarcoding
These databases were built with the CRUX pipeline, part of the Anacapa Toolkit (Curd et al., in Methods in Ecology and Evolution).
We update these databases annually. If you are using a different primer or locus, we encourage you to build your own CRUX database. Because CRUX is cumbersome to get working, we can build databases for collaborators and, occasionally, for people who ask nicely.
The seed sequences for EcoPCR came from select groups in the STD database from EMBL. Several collaborators are exploring alternatives, such as using the entire SRA as seed sequences. That would bring in more sequences, for example prokaryotic draft genomes, which are more common online than Sanger-based barcoding loci. We encourage you to share your experiences, explorations, and preferences with us.
Link to batch download the CRUX databases for CALeDNA: https://ucla.box.com/s/leg3czpnskxwi859frpnc1s8dsb054bk
16S: min size 60, max size 400
18S: min size 80, max size 550
PITS: min size 100, max size 800
CO1: min size 100, max size 700
FITS: min size 80, max size 700
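As an illustration of how the size limits above might be applied, here is a minimal Python sketch that keeps only sequences whose length falls within a locus's bounds. It assumes the sizes are sequence lengths in base pairs and that both bounds are inclusive; neither is stated above. The names SIZE_BOUNDS, within_bounds, and filter_by_size are hypothetical and are not part of CRUX or the Anacapa Toolkit.

```python
# Size limits per locus, copied from the list above.
# Assumption: values are sequence lengths in base pairs.
SIZE_BOUNDS = {
    "16S": (60, 400),
    "18S": (80, 550),
    "PITS": (100, 800),
    "CO1": (100, 700),
    "FITS": (80, 700),
}


def within_bounds(locus, sequence):
    """Return True if the sequence length is inside the locus limits.

    Assumption: both limits are inclusive.
    """
    min_size, max_size = SIZE_BOUNDS[locus]
    return min_size <= len(sequence) <= max_size


def filter_by_size(locus, records):
    """Keep only (name, sequence) pairs whose length fits the locus limits."""
    return [(name, seq) for name, seq in records if within_bounds(locus, seq)]
```

For example, filter_by_size("18S", records) would drop any record shorter than 80 or longer than 550 characters from a list of (name, sequence) pairs.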