Reference Databases for Metabarcoding

Metabarcoding Reference Databases

These databases were built with the CRUX pipeline, part of the Anacapa Toolkit (Curd et al., 2019, MEE).

We update these databases annually. If you are using a different primer or locus, we encourage you to make your own CRUX database. Because CRUX can be cumbersome to get working, we can also build databases for collaborators and, occasionally, for people who ask nicely.

These databases were built using select groups from the EMBL STD database as the seed sequences for EcoPCR. Several collaborators are exploring alternatives, such as using the entire SRA as seed sequences. That would increase the inclusion of sequences such as prokaryotic draft genomes, which are more common online than Sanger-based barcoding loci. We encourage you to share your experiences, explorations, and preferences with us.

The link below provides reference databases for the 5 loci that CALeDNA always uses.

Link to batch download the CRUX databases for CALeDNA: https://ucla.box.com/s/leg3czpnskxwi859frpnc1s8dsb054bk

Settings:

16S: min size 60, max size 400

18S: min size 80, max size 550

PITS: min size 100, max size 800

CO1: min size 100, max size 700

FITS: min size 80, max size 700
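
As a rough illustration of how these size windows might be applied, the sketch below stores the settings above in a lookup table and checks whether a sequence length falls inside the window for its locus. It is not part of CRUX. The names SIZE_LIMITS, within_size_limits, and filter_by_size are hypothetical, and the sketch assumes that sizes are amplicon lengths in base pairs and that both bounds are inclusive.

```python
# Size windows copied from the Settings list above.
# Assumption: values are amplicon lengths in base pairs, bounds inclusive.
SIZE_LIMITS = {
    "16S": (60, 400),
    "18S": (80, 550),
    "PITS": (100, 800),
    "CO1": (100, 700),
    "FITS": (80, 700),
}


def within_size_limits(locus, length):
    """Return True if `length` falls inside the size window for `locus`."""
    min_size, max_size = SIZE_LIMITS[locus]
    return min_size <= length <= max_size


def filter_by_size(locus, sequences):
    """Keep only the sequences whose length is inside the locus's window."""
    return [seq for seq in sequences if within_size_limits(locus, len(seq))]
```

For example, under these assumptions within_size_limits("CO1", 650) is True, while a 750-base CO1 sequence would be dropped by filter_by_size.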

For detailed information on how CRUX works, please check out the MEE paper by Curd et al. (2019).