Database
The application currently supports three types of databases: Sequence, Taxonomy, and kraken2. If you choose a Sequence database, you must also select a Taxonomy database; otherwise the analysis results will be incorrect.
For the Sequence database, the following databases are available:
| Database name | File | Size | Download link | Description |
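
The pairing rule above can be expressed as a small pre-flight check. The following is only a sketch in Python: the function name `validate_selection` and the string labels for the database types are hypothetical and are not taken from the application's code.

```python
# Hypothetical labels for the three database types described above.
KNOWN_TYPES = {"Sequence", "Taxonomy", "kraken2"}


def validate_selection(selected):
    """Return the selection as a set, or raise ValueError if it is invalid.

    Assumption: a Sequence database is only usable together with a
    Taxonomy database, as stated in the text above.
    """
    chosen = set(selected)
    unknown = chosen - KNOWN_TYPES
    if unknown:
        raise ValueError(f"Unknown database type(s): {sorted(unknown)}")
    if "Sequence" in chosen and "Taxonomy" not in chosen:
        raise ValueError("A Sequence database requires a Taxonomy database.")
    return chosen
```

Calling `validate_selection(["Sequence"])` raises a `ValueError`, while `validate_selection(["Sequence", "Taxonomy"])` returns the selection unchanged.
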
|---|---|---|---|---|
| 2022.10.seqs.fna.qza | 2022.10.seqs.fna.qza | 286.9 MB | https://ftp.microbio.me/greengenes_release/2022.10/ | The name of the file we're going to download is "2022.10.taxonomy.asv.nwk.qza", which means it is "taxonomy" data with the feature IDs represented as the actual amplicon sequence variants. |
| Grenegenes2022.10.backbone.full-length.fna.qza | 2022.10.backbone.full-length.fna.qza | 61.7 MB | https://ftp.microbio.me/greengenes_release/2022.10/ | Greengenes2 contains over 20,000,000 16S rRNA V4 amplicon sequencing fragments, derived from a dizzying collection of public and private microbiome samples in Qiita, representing a very large cross section of environment types. |
| ref-97_otus.qza | | 28.3 MB | https://www.arb-silva.de/download/archive/qiime | It is a single sequence. SILVA is the database for both small subunit (SSU; 16S/18S) and large subunit (LSU; 23S/28S) ribosomal RNA (rRNA) sequences. When preparing the database compatible with SILVA 132 QIIME, full-length 16S and 18S rRNA sequences - each labeled as belonging to a specific taxonomic unit - were downloaded from SILVA. |
| sh_refs_qiime_ver9_97_29.11.2022.qza | sh_qiime_release_29.11.2022.tgz | 12.5 MB | https://doi.plutof.ut.ee/doi/10.15156/BIO/2483915 | QIIME is a bioinformatics data science platform, originally developed for analysis of high-throughput microbiome marker gene (e.g., 16S or 18S rRNA genes) amplicon sequencing data. There have been two major versions of the QIIME platform, QIIME 1 and QIIME 2. |
| sh_refs_qiime_ver9_99_29.11.2022.qza | sh_qiime_release_29.11.2022.tgz | 17.8 MB | https://doi.plutof.ut.ee/doi/10.15156/BIO/2483915 | QIIME is a bioinformatics data science platform, originally developed for analysis of high-throughput microbiome marker gene (e.g., 16S or 18S rRNA genes) amplicon sequencing data. There have been two major versions of the QIIME platform, QIIME 1 and QIIME 2. |
| sh_refs_qiime_ver9_dynamic_29.11.2022.qza | sh_qiime_release_29.11.2022.tgz | 15.9 MB | https://doi.plutof.ut.ee/doi/10.15156/BIO/2483915 | QIIME is a bioinformatics data science platform, originally developed for analysis of high-throughput microbiome marker gene (e.g., 16S or 18S rRNA genes) amplicon sequencing data. There have been two major versions of the QIIME platform, QIIME 1 and QIIME 2. |
| silva-138-99-seqs-515-806.qza | Greengenes 13_8 SEPP reference database | 13.9 MB | https://docs.qiime2.org/2022.8/data-resources/#taxonomy-classifiers-for-use-with-q2-feature-classifier | The SSU Ref NR 99 138.1 dataset is based on the full SSU Ref 138.1 dataset, encompassing 510,508 sequences in total. Highly similar sequences are removed by applying a 99% identity criterion with the vsearch tool, using a custom sequence order based first on presence in the last release's Ref NR 99 and second on a combination of sequence length (weighted twofold) and quality. |
| silva-138-99-seqs.qza | Greengenes 13_8 SEPP reference database | 92.6 MB | https://docs.qiime2.org/2022.8/data-resources/#taxonomy-classifiers-for-use-with-q2-feature-classifier | The SSU Ref NR 99 138.1 dataset is based on the full SSU Ref 138.1 dataset, encompassing 510,508 sequences in total. Highly similar sequences are removed by applying a 99% identity criterion with the vsearch tool, using a custom sequence order based first on presence in the last release's Ref NR 99 and second on a combination of sequence length (weighted twofold) and quality. For the sorting, the quality of a sequence is determined by ambiguities (50%), overall alignment quality (45%), and homopolymers (5%). |
| silva_132_90_16S_sequence.qza | Silva_132_release.zip | 10.8 MB | https://www.arb-silva.de/download/archive/qiime | It is a single sequence. SILVA is the database for both small subunit (SSU; 16S/18S) and large subunit (LSU; 23S/28S) ribosomal RNA (rRNA) sequences. When preparing the database compatible with SILVA 132 QIIME, full-length 16S and 18S rRNA sequences - each labeled as belonging to a specific taxonomic unit - were downloaded from SILVA. |
| silva_132_90_18S_sequence.qza | Silva_132_release.zip | 13.2 MB | https://www.arb-silva.de/download/archive/qiime | It is a single sequence. SILVA is the database for both small subunit (SSU; 16S/18S) and large subunit (LSU; 23S/28S) ribosomal RNA (rRNA) sequences. When preparing the database compatible with SILVA 132 QIIME, full-length 16S and 18S rRNA sequences - each labeled as belonging to a specific taxonomic unit - were downloaded from SILVA. |
| silva_132_94_16S_sequence.qza | Silva_132_release.zip | 10.8 MB | https://www.arb-silva.de/download/archive/qiime | It is a single sequence. SILVA is the database for both small subunit (SSU; 16S/18S) and large subunit (LSU; 23S/28S) ribosomal RNA (rRNA) sequences. When preparing the database compatible with SILVA 132 QIIME, full-length 16S and 18S rRNA sequences - each labeled as belonging to a specific taxonomic unit - were downloaded from SILVA. |
| silva_132_94_18S_sequence.qza | Silva_132_release.zip | 5.5 MB | https://www.arb-silva.de/download/archive/qiime | It is a single sequence. SILVA is the database for both small subunit (SSU; 16S/18S) and large subunit (LSU; 23S/28S) ribosomal RNA (rRNA) sequences. When preparing the database compatible with SILVA 132 QIIME, full-length 16S and 18S rRNA sequences - each labeled as belonging to a specific taxonomic unit - were downloaded from SILVA. |
| silva_132_97_16S_sequence.qza | Silva_132_release.zip | 47.3 MB | https://www.arb-silva.de/download/archive/qiime | It is a single sequence. SILVA is the database for both small subunit (SSU; 16S/18S) and large subunit (LSU; 23S/28S) ribosomal RNA (rRNA) sequences. When preparing the database compatible with SILVA 132 QIIME, full-length 16S and 18S rRNA sequences - each labeled as belonging to a specific taxonomic unit - were downloaded from SILVA. |
| silva_132_99_16S_sequence.qza | Silva_132_release.zip | 89.9 MB | https://www.arb-silva.de/download/archive/qiime | It is a single sequence. SILVA is the database for both small subunit (SSU; 16S/18S) and large subunit (LSU; 23S/28S) ribosomal RNA (rRNA) sequences. When preparing the database compatible with SILVA 132 QIIME, full-length 16S and 18S rRNA sequences - each labeled as belonging to a specific taxonomic unit - were downloaded from SILVA. |
| silva_132_99_18S_sequence.qza | Silva_132_release.zip | 15.5 MB | https://www.arb-silva.de/download/archive/qiime | It is a single sequence. SILVA is the database for both small subunit (SSU; 16S/18S) and large subunit (LSU; 23S/28S) ribosomal RNA (rRNA) sequences. When preparing the database compatible with SILVA 132 QIIME, full-length 16S and 18S rRNA sequences - each labeled as belonging to a specific taxonomic unit - were downloaded from SILVA. |
For the Taxonomy database, the following databases are available:
| Database name | File | Size | Download link | Description |
|---|---|---|---|---|
| 2022.10.backbone.tax.qza | 2022.10.backbone.tax.qza | 4.5 MB | https://ftp.microbio.me/greengenes_release/2022.10/ | The Greengenes database, fully redesigned from the ground up, backed by whole genomes, with a focus on harmonizing 16S rRNA and shotgun metagenomic datasets. |
| Grenegenes2022.10.taxonomy.md5.tsv.qza | 2022.10.taxonomy.md5.tsv.qza | 424.3 MB | https://ftp.microbio.me/greengenes_release/2022.10/ | The Greengenes database, fully redesigned from the ground up, backed by whole genomes, with a focus on harmonizing 16S rRNA and shotgun metagenomic datasets. |
| ref-taxonomy.qza | | 1.2 MB | https://www.arb-silva.de/download/archive/qiime | QIIME 2 is a software platform used for microbiome analysis, particularly for processing and analyzing DNA sequence data derived from microbial communities. |
| sh_taxonomy_qiime_ver9_97_29.11.2022.qza | sh_qiime_release_29.11.2022.tgz | 1.9 MB | https://doi.plutof.ut.ee/doi/10.15156/BIO/2483915 | QIIME 2 is a software platform used for microbiome analysis, particularly for processing and analyzing DNA sequence data derived from microbial communities. |
| sh_taxonomy_qiime_ver9_99_29.11.2022.qza | sh_qiime_release_29.11.2022.tgz | 3 MB | https://doi.plutof.ut.ee/doi/10.15156/BIO/2483915 | QIIME 2 is a software platform used for microbiome analysis, particularly for processing and analyzing DNA sequence data derived from microbial communities. |
| sh_taxonomy_qiime_ver9_dynamic_29.11.2022.qza | | 2.7 MB | https://doi.plutof.ut.ee/doi/10.15156/BIO/2483915 | QIIME 2 is a software platform used for microbiome analysis, particularly for processing and analyzing DNA sequence data derived from microbial communities. |
| silva-138-99-tax-515-806.qza | Greengenes 13_8 SEPP reference database | 5.3 MB | https://docs.qiime2.org/2022.8/data-resources/#taxonomy-classifiers-for-use-with-q2-feature-classifier | The SSU Ref NR 99 138.1 dataset is based on the full SSU Ref 138.1 dataset, encompassing 510,508 sequences in total. Highly similar sequences are removed by applying a 99% identity criterion with the vsearch tool, using a custom sequence order based first on presence in the last release's Ref NR 99 and second on a combination of sequence length (weighted twofold) and quality. For the sorting, the quality of a sequence is determined by ambiguities (50%), overall alignment quality (45%), and homopolymers (5%). |
| silva_132_90_16S_taxonomy.qza | Silva_132_release.zip | 856.1 MB | https://www.arb-silva.de/download/archive/qiime | It is a single sequence. SILVA is the database for both small subunit (SSU; 16S/18S) and large subunit (LSU; 23S/28S) ribosomal RNA (rRNA) sequences. When preparing the database compatible with SILVA 132 QIIME, full-length 16S and 18S rRNA sequences - each labeled as belonging to a specific taxonomic unit - were downloaded from SILVA. |
| silva_132_90_18S_taxonomy.qza | Silva_132_release.zip | 286.9 MB | https://www.arb-silva.de/download/archive/qiime | It is a single sequence. SILVA is the database for both small subunit (SSU; 16S/18S) and large subunit (LSU; 23S/28S) ribosomal RNA (rRNA) sequences. When preparing the database compatible with SILVA 132 QIIME, full-length 16S and 18S rRNA sequences - each labeled as belonging to a specific taxonomic unit - were downloaded from SILVA. |
| silva_132_94_18S_taxonomy.qza | Silva_132_release.zip | 286.9 MB | https://www.arb-silva.de/download/archive/qiime | It is a single sequence. SILVA is the database for both small subunit (SSU; 16S/18S) and large subunit (LSU; 23S/28S) ribosomal RNA (rRNA) sequences. When preparing the database compatible with SILVA 132 QIIME, full-length 16S and 18S rRNA sequences - each labeled as belonging to a specific taxonomic unit - were downloaded from SILVA. |
| silva_132_97_16S_taxonomy.qza | Silva_132_release.zip | 4.1 MB | https://www.arb-silva.de/download/archive/qiime | It is a single sequence. SILVA is the database for both small subunit (SSU; 16S/18S) and large subunit (LSU; 23S/28S) ribosomal RNA (rRNA) sequences. When preparing the database compatible with SILVA 132 QIIME, full-length 16S and 18S rRNA sequences - each labeled as belonging to a specific taxonomic unit - were downloaded from SILVA. |
| silva_132_97_18S_taxonomy.qza | Silva_132_release.zip | 848.6 MB | https://www.arb-silva.de/download/archive/qiime | It is a single sequence. SILVA is the database for both small subunit (SSU; 16S/18S) and large subunit (LSU; 23S/28S) ribosomal RNA (rRNA) sequences. When preparing the database compatible with SILVA 132 QIIME, full-length 16S and 18S rRNA sequences - each labeled as belonging to a specific taxonomic unit - were downloaded from SILVA. |
| silva_132_99_16S_taxonomy.qza | Silva_132_release.zip | 8.6 MB | https://www.arb-silva.de/download/archive/qiime | It is a single sequence. SILVA is the database for both small subunit (SSU; 16S/18S) and large subunit (LSU; 23S/28S) ribosomal RNA (rRNA) sequences. When preparing the database compatible with SILVA 132 QIIME, full-length 16S and 18S rRNA sequences - each labeled as belonging to a specific taxonomic unit - were downloaded from SILVA. |
| silva_132_99_18S_taxonomy.qza | Silva_132_release.zip | 1.6 MB | https://www.arb-silva.de/download/archive/qiime | It is a single sequence. SILVA is the database for both small subunit (SSU; 16S/18S) and large subunit (LSU; 23S/28S) ribosomal RNA (rRNA) sequences. When preparing the database compatible with SILVA 132 QIIME, full-length 16S and 18S rRNA sequences - each labeled as belonging to a specific taxonomic unit - were downloaded from SILVA. |
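
Because every Sequence database has to be combined with a Taxonomy database, it can help to look up the likely counterpart by name. The sketch below is a minimal illustration in Python. It assumes that matching entries in the two tables differ only in a name fragment (`_sequence.qza`/`_taxonomy.qza`, `sh_refs_`/`sh_taxonomy_`, `-seqs-`/`-tax-`). This pairing rule is inferred from the names listed above and is not confirmed by the application; the function name `guess_taxonomy_name` is hypothetical, and the Greengenes entries are not covered.

```python
# Taxonomy database names copied from the Taxonomy table above
# (UNITE and SILVA entries only).
TAXONOMY_NAMES = {
    "sh_taxonomy_qiime_ver9_97_29.11.2022.qza",
    "sh_taxonomy_qiime_ver9_99_29.11.2022.qza",
    "sh_taxonomy_qiime_ver9_dynamic_29.11.2022.qza",
    "silva-138-99-tax-515-806.qza",
    "silva_132_90_16S_taxonomy.qza",
    "silva_132_90_18S_taxonomy.qza",
    "silva_132_94_18S_taxonomy.qza",
    "silva_132_97_16S_taxonomy.qza",
    "silva_132_97_18S_taxonomy.qza",
    "silva_132_99_16S_taxonomy.qza",
    "silva_132_99_18S_taxonomy.qza",
}

# Assumed (sequence fragment, taxonomy fragment) naming patterns.
NAME_PATTERNS = (
    ("_sequence.qza", "_taxonomy.qza"),
    ("sh_refs_", "sh_taxonomy_"),
    ("-seqs-", "-tax-"),
)


def guess_taxonomy_name(sequence_name):
    """Return the taxonomy name that matches a sequence name, or None.

    None means no listed taxonomy database follows the assumed pattern,
    so the pairing has to be chosen manually.
    """
    for seq_part, tax_part in NAME_PATTERNS:
        if seq_part in sequence_name:
            candidate = sequence_name.replace(seq_part, tax_part, 1)
            return candidate if candidate in TAXONOMY_NAMES else None
    return None
```

With the tables as listed, `guess_taxonomy_name("silva_132_94_16S_sequence.qza")` returns `None`, because no `silva_132_94_16S_taxonomy.qza` entry appears in the Taxonomy table.
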
For the kraken2 database, the following databases are available:
| Database name | File | Size | Download link | Description |
|---|---|---|---|---|
| 16S_Greengenes13.5_20200326.tgz | Greengenes 13.5.tar.gz | 73.2 MB | https://benlangmead.github.io/aws-indexes/k2 | It is a single Kraken 2 database. SILVA is the database for both small subunit (SSU; 16S/18S) and large subunit (LSU; 23S/28S) ribosomal RNA (rRNA) sequences. When preparing the database compatible with SILVA 132 QIIME, full-length 16S and 18S rRNA sequences - each labeled as belonging to a specific taxonomic unit - were downloaded from SILVA. |
| 16S_RDP11.5_20200326.tgz | 16S_RDP11.5_20200326.tgz | 167.9 MB | https://benlangmead.github.io/aws-indexes/k2 | All packages contain a Kraken 2 database along with Bracken databases built for 100mers, 150mers, and 200mers. Kraken 2 is a fast and memory efficient tool for taxonomic assignment of metagenomics sequencing reads. Bracken is a related tool that additionally estimates |
| 16S_Silva132_20200326.tgz | Silva_132_release.zip | 116.9 MB | https://www.arb-silva.de/download/archive/qiime | It is a single Kraken 2 database. SILVA is the database for both small subunit (SSU; 16S/18S) and large subunit (LSU; 23S/28S) ribosomal RNA (rRNA) sequences. When preparing the database compatible with SILVA 132 QIIME, full-length 16S and 18S rRNA sequences - each labeled as belonging to a specific taxonomic unit - were downloaded from SILVA. |
| 16S_Silva138_20200326.tgz | Greengenes 13_8 SEPP reference database | 112.5 MB | https://docs.qiime2.org/2022.8/data-resources/#taxonomy-classifiers-for-use-with-q2-feature-classifier | The SSU Ref NR 99 138.1 dataset is based on the full SSU Ref 138.1 dataset, encompassing 510,508 sequences in total. Highly similar sequences are removed by applying a 99% identity criterion with the vsearch tool, using a custom sequence order based first on presence in the last release's Ref NR 99 and second on a combination of sequence length (weighted twofold) and quality. For the sorting, the quality of a sequence is determined by ambiguities (50%), overall alignment quality (45%), and homopolymers (5%). |