SeqSero, launched in 2015, is a software tool for Salmonella serotype determination from whole-genome sequencing (WGS) data. Despite its routine use in public health and food safety laboratories in the United States and other countries, the original SeqSero pipeline is relatively slow (minutes per genome using sequencing reads), is not optimized for draft genome assemblies, and may assign multiple serotypes for a strain. Here, we present SeqSero2 (github.com/denglab/SeqSero2; denglab.info/SeqSero2), an algorithmic transformation and functional update of the original SeqSero. Major improvements include (i) additional sequence markers for identification of Salmonella species and subspecies and certain serotypes, (ii) a k-mer based algorithm for rapid serotype prediction from raw reads (seconds per genome) and improved serotype prediction from assemblies, and (iii) a targeted assembly approach for specific retrieval of serotype determinants from WGS for serotype prediction, new allele discovery, and prediction troubleshooting. Evaluated using 5,794 genomes representing 364 common U.S. serotypes, including 2,280 human isolates of 117 serotypes from the National Antimicrobial Resistance Monitoring System, SeqSero2 is up to 50 times faster than the original SeqSero while maintaining equivalent accuracy for raw reads and substantially improving accuracy for assemblies. SeqSero2 further suggested that 3% of the tested genomes contained reads from multiple serotypes, indicating a use for contamination detection. In addition to short reads, SeqSero2 demonstrated potential for accurate and rapid serotype prediction directly from long nanopore reads despite base call errors. Testing of 40 nanopore-sequenced genomes of 17 serotypes yielded a single H antigen misidentification.
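The idea of matching read k-mers against serotype determinants can be illustrated with a minimal sketch. This is not SeqSero2's actual implementation: the function names, the toy marker sequences, the allele labels, and the choice of k = 8 are all assumptions made for illustration, and the sketch ignores reverse-complement strands and sequencing errors.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping k-mer of a DNA string (uppercased)."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(markers, k):
    """Map each k-mer to the set of marker alleles that contain it."""
    index = {}
    for allele, seq in markers.items():
        for km in kmers(seq, k):
            index.setdefault(km, set()).add(allele)
    return index


def score_reads(reads, index, k):
    """Count, per allele, how many read k-mers are found in the index."""
    hits = Counter()
    for read in reads:
        for km in kmers(read, k):
            for allele in index.get(km, ()):
                hits[allele] += 1
    return hits


# Toy, made-up sequences (hypothetical); real determinants are far longer.
markers = {
    "fliC_i": "ACGTACGTTAGCCTAGGATC",
    "fliC_g_m": "TTGACCGGTATCGATCGGAA",
}
index = build_index(markers, k=8)

# One read drawn from the first marker and one unrelated read.
reads = ["CGTACGTTAGCCTA", "GGGGGGGGGGGGGG"]
hits = score_reads(reads, index, k=8)

# The allele with the most k-mer hits is the best-supported candidate.
best = hits.most_common(1)[0][0]
print(best, dict(hits))
```

Because the index is built once and each read only needs dictionary lookups, no alignment or assembly step is required, which is the general reason k-mer approaches tend to be fast on raw reads.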
Serotyping is the basis of public health surveillance of Salmonella. It remains a first-line subtyping method even as surveillance continues to be transformed by whole-genome sequencing. SeqSero allows the integration of Salmonella serotyping into a whole-genome-sequencing-based laboratory workflow while maintaining continuity with the classic serotyping scheme. SeqSero2, informed by extensive testing and application of SeqSero in the United States and other countries, incorporates important improvements and updates that further strengthen its application in routine and large-scale surveillance of Salmonella by whole-genome sequencing.
Despite control efforts, salmonellosis continues to cause an estimated 1.2 million infections in the United States (US) annually. We describe the incidence of salmonellosis in the US and introduce a novel approach to examine the epidemiologic similarities and differences of individual serotypes.
Cases of salmonellosis in humans reported to the laboratory-based National Salmonella Surveillance System during 1996-2011 from US states were included. Coefficients of variation were used to describe distribution of incidence rates of common Salmonella serotypes by geographic region, age group and sex of patient, and month of sample isolation.
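As a minimal sketch of the statistic named above: a coefficient of variation is the standard deviation of a set of rates divided by their mean, so a higher value indicates a more uneven (more concentrated) distribution. The incidence values below are hypothetical, and the use of the sample standard deviation is an assumption; the abstract does not say which estimator the authors used.

```python
from statistics import mean, stdev


def coefficient_of_variation(rates):
    """Return the sample standard deviation divided by the mean."""
    m = mean(rates)
    if m == 0:
        raise ValueError("mean incidence is zero; CV is undefined")
    return stdev(rates) / m


# Hypothetical incidence rates (cases/100,000 persons) in four regions.
evenly_spread = [2.0, 2.1, 1.9, 2.0]   # similar rate everywhere
concentrated = [0.2, 0.3, 0.1, 7.4]    # dominated by one region

cv_even = coefficient_of_variation(evenly_spread)
cv_conc = coefficient_of_variation(concentrated)
print(f"evenly spread: {cv_even:.2f}, concentrated: {cv_conc:.2f}")
```

The same function could be applied to rates grouped by age group, sex, or month of isolation, since the statistic only needs one rate per category.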
During 1996-2011, more than 600,000 Salmonella isolates from humans were reported, with an average annual incidence of 13.1 cases/100,000 persons. The annual reported rate of Salmonella infections did not decrease during the study period. The top five most commonly reported serotypes, Typhimurium, Enteritidis, Newport, Heidelberg, and Javiana, accounted for 62% of fully serotyped isolates. Coefficients of variation showed the most geographically concentrated serotypes were often clustered in Gulf Coast states and were also more frequently found to be increasing in incidence. Serotypes clustered in particular months, age groups, and sex were also identified and described.
Although overall incidence rates of Salmonella did not change over time, trends and epidemiological factors differed remarkably by serotype. A better understanding of Salmonella, facilitated by this comprehensive description of overall trends and unique characteristics of individual serotypes, will assist in responding to this disease and in planning and implementing prevention activities.
This supplement (no. 48) of the White–Kauffmann–Le Minor scheme reports on the characterization of 63 new Salmonella serovars and 25 new variants of previously described Salmonella serovars recognized by the WHO Collaborating Centre for Reference and Research on Salmonella between 2008 and 2010. Forty-four new serovars were assigned to Salmonella enterica subspecies enterica, 12 to subspecies salamae, two to subspecies arizonae, two to subspecies diarizonae and three to subspecies houtenae. All these new serovars or new variants are described with their multilocus sequence type.
Serotyping forms the basis of national and international surveillance networks for Salmonella, one of the most prevalent foodborne pathogens worldwide (1-3). Public health microbiology is currently being transformed by whole-genome sequencing (WGS), which opens the door to serotype determination using WGS data. SeqSero (www.denglab.info/SeqSero) is a novel Web-based tool for determining Salmonella serotypes using high-throughput genome sequencing data. SeqSero is based on curated databases of Salmonella serotype determinants (rfb gene cluster, fliC and fljB alleles) and is predicted to determine serotype rapidly and accurately for nearly the full spectrum of Salmonella serotypes (more than 2,300 serotypes), from both raw sequencing reads and genome assemblies. The performance of SeqSero was evaluated by testing (i) raw reads from genomes of 308 Salmonella isolates of known serotype; (ii) raw reads from genomes of 3,306 Salmonella isolates sequenced and made publicly available by GenomeTrakr, a U.S. national monitoring network operated by the Food and Drug Administration; and (iii) 354 other publicly available draft or complete Salmonella genomes. We also demonstrated Salmonella serotype determination from raw sequencing reads of fecal metagenomes from mice orally infected with this pathogen. SeqSero can help to maintain the well-established utility of Salmonella serotyping when integrated into a platform of WGS-based pathogen subtyping and characterization.
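Once the O antigen (from the rfb cluster) and the two flagellar phases (fliC and fljB) have been called, assigning a serotype name amounts to a lookup of the antigenic formula in the serotyping scheme. The sketch below is illustrative only and is not SeqSero's code: the table is a tiny excerpt simplified to the O serogroup factor, and the `SCHEME` and `predict` names are hypothetical.

```python
# Tiny excerpt of an antigenic-formula table, simplified to the O serogroup
# factor: (O antigen, phase 1 H antigen, phase 2 H antigen) -> serotype name.
# "-" marks an absent phase 2 antigen (a monophasic strain).
SCHEME = {
    ("4", "i", "1,2"): "Typhimurium",
    ("4", "r", "1,2"): "Heidelberg",
    ("9", "g,m", "-"): "Enteritidis",
    ("8", "e,h", "1,2"): "Newport",
}


def predict(o_antigen, h1, h2="-"):
    """Return (antigenic formula, serotype name or None if not in the table)."""
    formula = f"{o_antigen}:{h1}:{h2}"
    return formula, SCHEME.get((o_antigen, h1, h2))


print(predict("4", "i", "1,2"))
print(predict("9", "g,m"))
print(predict("4", "i"))  # formula is still reported when no name matches
```

Returning the formula even when no name matches keeps the result compatible with the classic scheme, since a formula such as 4:i:- is itself a reportable outcome.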
Increasingly, routine surveillance and monitoring of foodborne pathogens using whole-genome sequencing is creating opportunities to study foodborne illness epidemiology beyond routine outbreak investigations and case-control studies. Using a global phylogeny of Salmonella enterica serotype Typhimurium, we found that major livestock sources of the pathogen in the United States can be predicted through whole-genome sequencing data. Relatively steady rates of sequence divergence in livestock lineages enabled the inference of their recent origins. Elevated accumulation of lineage-specific pseudogenes after divergence from generalist populations and possible metabolic acclimation in a representative swine isolate indicate possible emergence of host adaptation. We developed and retrospectively applied a machine learning Random Forest classifier for genomic source prediction of Salmonella Typhimurium that correctly attributed 7 of 8 major zoonotic outbreaks in the United States during 1998-2013. We further identified 50 key genetic features that were sufficient for robust livestock source prediction.
Antimicrobial resistant Salmonella enterica serovar Concord (S. Concord) is known to cause severe gastrointestinal and bloodstream infections in patients from Ethiopia and Ethiopian adoptees, and occasional records exist of S. Concord linked to other countries. The evolution and geographical distribution of S. Concord remained unclear. Here, we provide a genomic overview of the population structure and antimicrobial resistance (AMR) of S. Concord by analysing genomes from 284 historical and contemporary isolates obtained between 1944 and 2022 across the globe. We demonstrate that S. Concord is a polyphyletic serovar distributed among three Salmonella super-lineages. Super-lineage A is composed of eight S. Concord lineages, of which four are associated with multiple countries and low levels of AMR. Other lineages are restricted to Ethiopia and horizontally acquired resistance to most antimicrobials used for treating invasive Salmonella infections in low- and middle-income countries. By reconstructing complete genomes for 10 representative strains, we demonstrate the presence of AMR markers integrated in structurally diverse IncHI2 and IncA/C2 plasmids, and/or the chromosome. Molecular surveillance of pathogens such as S. Concord supports the understanding of AMR and the multi-sector response to the global AMR threat. This study provides a comprehensive baseline data set essential for future molecular surveillance.
This supplement reports the characterization of 70 new Salmonella serovars recognized between 2003 and 2007 by the WHO Collaborating Center for Reference and Research on Salmonella: 44 were assigned to Salmonella enterica subspecies enterica, 11 to subspecies salamae, 5 to subspecies arizonae, 8 to subspecies diarizonae, one to subspecies houtenae and one to Salmonella bongori. One new serovar, Mygdal, displayed a new H factor, H:z₉₁.
Salmonella enterica serotype Enteritidis is one of the most commonly reported causes of human salmonellosis. Its low genetic diversity, measured by fingerprinting methods, has made subtyping a challenge. We used whole-genome sequencing to characterize 125 S. enterica Enteritidis and 3 S. enterica serotype Nitra strains. Single-nucleotide polymorphisms were filtered to identify 4,887 reliable loci that distinguished all isolates from each other. Our whole-genome single-nucleotide polymorphism typing approach was robust for S. enterica Enteritidis subtyping with combined data for different strains from 2 different sequencing platforms. Five major genetic lineages were recognized, which revealed possible patterns of geographic and epidemiologic distribution. Analyses on the population dynamics and evolutionary history estimated that major lineages emerged during the 17th-18th centuries and diversified during the 1920s and 1950s.
A retrospective investigation was performed to evaluate whole-genome sequencing as a benchmark for comparing molecular subtyping methods for Salmonella enterica serotype Enteritidis and survey the population structure of commonly encountered S. enterica serotype Enteritidis outbreak isolates in the United States. A total of 52 S. enterica serotype Enteritidis isolates representing 16 major outbreaks and three sporadic cases collected between 2001 and 2012 were sequenced and subjected to subtyping by four different methods: (i) whole-genome single-nucleotide-polymorphism typing (WGST), (ii) multiple-locus variable-number tandem-repeat (VNTR) analysis (MLVA), (iii) clustered regularly interspaced short palindromic repeats combined with multi-virulence-locus sequence typing (CRISPR-MVLST), and (iv) pulsed-field gel electrophoresis (PFGE). WGST resolved all outbreak clusters and provided useful robust phylogenetic inference results with high epidemiological correlation. While both MLVA and CRISPR-MVLST yielded higher discriminatory power than PFGE, MLVA outperformed the other methods in delineating outbreak clusters whereas CRISPR-MVLST showed the potential to trace major lineages and ecological origins of S. enterica serotype Enteritidis. Our results suggested that whole-genome sequencing makes a viable platform for the evaluation and benchmarking of molecular subtyping methods.
Serotyping of Salmonella has been an invaluable subtyping method for epidemiologic studies for more than 70 years. The technical difficulties of serotyping, primarily in antiserum production and quality control, can be overcome with modern molecular methods. We developed a DNA-based assay targeting the genes encoding the flagellar antigens (fliC and fljB) of the Kauffmann-White serotyping scheme. Fifteen H antigens (H:a, -b, -c, -d, -d/j, -e,h, -i, -k, -r, -y, -z, -z₁₀, -z₂₉, -z₃₅, and -z₆), 5 complex major antigens (H:G, -EN, -Z4, -1, and -L) and 16 complex secondary antigens (H:2, -5, -6, -7, -f, -m/g,m, -m/m,t, -p, -s, -t/m,t, -v, -x, -z₁₅, -z₂₄, -z₂₈, and -z₅₁) were targeted in the assay. DNA probes targeting these antigens were designed and evaluated on 500 isolates tested in parallel with traditional serotyping methods. The assay correctly identified 461 (92.2%) isolates based on the 36 antigens detected in the assay. Among the isolates considered correctly identified, 47 (9.4%) were partially serotyped because probes corresponding to some antigens in the strains were not in the assay, and 13 (2.6%) were monophasic or nonmotile strains that possessed flagellar antigen genes that were not expressed but were detected in the assay. The 39 (7.8%) strains that were not correctly identified possessed an antigen that should have been detected by the assay but was not. Apparent false-negative results may be attributed to allelic divergence. The molecular assay provided results that paralleled traditional methods with a much greater throughput, while maintaining the integrity of the Kauffmann-White serotyping scheme, thus providing backwards-compatible epidemiologic data. This assay should greatly enhance the ability of clinical and public health laboratories to serotype Salmonella.