TY - DATA
T1 - Data and code underlying the publication: Emergence of novel SARS-CoV-2 variants in the Netherlands
PY - 2025/04/07
AU - Aysun Urhan
AU - Thomas Abeel
UR -
DO - 10.4121/11bff1ea-4784-463e-90d0-eb2e2b64fe96.v1
KW - bioinformatics
KW - genomics
KW - genome data
KW - sars-cov-2
KW - virus genomics
KW - microorganisms
KW - viral genomics
N2 -
SARS-CoV-2 genome dataset accompanying our publication "Emergence of novel SARS-CoV-2 variants in the Netherlands", published in Scientific Reports [1].
Complete, high-quality genome sequences of SARS-CoV-2 (fewer than 1% of bases undetermined), isolated from human hosts only, were obtained from GISAID, NCBI, and China’s National Genomics Data Center (NGDC) on June 13, 2020. The dataset contained 29,503 sequences with unique identifiers in total, including the Wuhan-Hu-1 reference sequence (accession ID NC_045512.2). The “Collection date” field was also extracted for all sequences and is referred to as “date” throughout our work.
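The quality criterion described above (undetermined bases below 1% of the sequence length) can be illustrated with a minimal Python sketch. This is not the code used for the publication: the function names `undetermined_fraction` and `filter_high_quality` are hypothetical, and the sketch assumes that only the character `N` counts as an undetermined base, which the description does not state explicitly.

```python
from typing import Dict


def undetermined_fraction(sequence: str) -> float:
    """Return the fraction of bases in `sequence` that are 'N' (case-insensitive).

    Assumption: only 'N' is treated as undetermined; other IUPAC
    ambiguity codes are counted as determined bases.
    """
    if not sequence:
        # Treat an empty record as fully undetermined so it is never kept.
        return 1.0
    return sequence.upper().count("N") / len(sequence)


def filter_high_quality(
    records: Dict[str, str], max_fraction: float = 0.01
) -> Dict[str, str]:
    """Keep records whose undetermined fraction is strictly below `max_fraction`.

    `records` maps a sequence identifier to its nucleotide string.
    """
    return {
        identifier: sequence
        for identifier, sequence in records.items()
        if undetermined_fraction(sequence) < max_fraction
    }
```

For example, a 400-base record with 3 undetermined bases (0.75%) would be kept, while one with 8 undetermined bases (2%) would be dropped.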
The acknowledgement table for GISAID sequences can be found in Supplementary file 2, and the full list of sequence identifiers for NCBI and NGDC records is provided in Supplementary file 3 of the corresponding publication [1].
[1]: Urhan, A., Abeel, T. Emergence of novel SARS-CoV-2 variants in the Netherlands. Sci Rep 11, 6625 (2021). https://doi.org/10.1038/s41598-021-85363-7
ER -