Main Database Files
File | Description |
KSGP_v1.0.fasta | Version 1 of the KSGP database. Contains cleaned GTDB SSU sequences and eukaryote sequences from PR2 with their original annotations, combined with reannotated SSU rRNA sequences from Karst et al. and archaeal 16S sequences from SILVA. Please note that the taxonomic hierarchy used by PR2 is not compatible with that used by SILVA or NCBI. |
KSGP_v1.0.tax | LotuS2 tax file for version 1.0 of the KSGP database |
KSGP_v1.0.tar.gz | Complete KSGP v1.0 database |
Auxiliary Files
File | Description |
GTDB_plus_v1.0.fasta | Version 1 of the GTDB+ database (cleaned GTDB plus eukaryotes from PR2). Please note that the taxonomic hierarchy used by PR2 is not compatible with that used by SILVA or NCBI. |
GTDB_plus_v1.0.tax | LotuS2 tax file for version 1.0 of the GTDB+ database |
GTDB_cleaned_v1.0.fasta | Cleaned and deduplicated GTDB FASTA file with domain-level misassignments removed. It should be combined with a database of eukaryote 18S sequences, such as PR2. |
GTB_cleaned_v1.0.tax | Corresponding LotuS2 tax file |
GTDB_214_SSU_sequences_removed_as_wrong_domain.csv | SSU sequences in GTDB that RDP Classify assigned to a different domain and that were therefore removed. The file includes both the original GTDB classification and the RDP Classify annotation. |
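The description of GTDB_cleaned_v1.0.fasta recommends combining it with a eukaryote 18S database such as PR2. The following is a minimal Python sketch of one way to do that by concatenating FASTA files and skipping any record whose ID has already been written. It is not the procedure used to build GTDB+ or KSGP. The function name `combine_fasta` and the PR2 file name in the usage note are hypothetical, and the matching tax files would need to be combined separately.

```python
def combine_fasta(inputs, output):
    """Concatenate FASTA files into `output`.

    A record is skipped if its ID (the first whitespace-delimited token
    after '>') has already been written. Returns the number of records written.
    """
    seen = set()
    written = 0
    with open(output, "w") as out:
        for path in inputs:
            keep = False  # whether the current record is being copied
            with open(path) as handle:
                for line in handle:
                    if line.startswith(">"):
                        tokens = line[1:].split()
                        seq_id = tokens[0] if tokens else ""
                        keep = seq_id not in seen
                        if keep:
                            seen.add(seq_id)
                            written += 1
                    if keep:
                        # Guarantee a trailing newline so records from
                        # different files never run together.
                        out.write(line if line.endswith("\n") else line + "\n")
    return written
```

Assuming a local PR2 FASTA export called `pr2_18S.fasta` (a placeholder name), a call such as `combine_fasta(["GTDB_cleaned_v1.0.fasta", "pr2_18S.fasta"], "combined.fasta")` would write the merged file and return the record count.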
Data sources
GTDB+ contains sequences and taxonomic assignments from version 214.0 of GTDB and version 5.0 of PR2. KSGP also contains sequences from SILVA SSURef NR99, version 138.1, and ENA accession GCA_900214305.