Bibliography on: Pangenome

The Electronic Scholarly Publishing Project: Providing world-wide, free access to classic scientific papers and other scholarly materials, since 1993.

ESP: PubMed Auto Bibliography. Created: 07 Feb 2026 at 01:33

Pangenome

Although the enforced stability of genomic content is ubiquitous among multicellular eukaryotes (MCEs), the opposite is proving to be the case among prokaryotes, which exhibit remarkable and adaptive plasticity of genomic content. Early bacterial whole-genome sequencing efforts discovered that whenever a particular "species" was re-sequenced, new genes were found that had not been detected earlier: entirely new genes, not merely new alleles. This led to the concepts of the bacterial core-genome, the set of genes found in all members of a particular "species", and the flex-genome, the set of genes found in some, but not all, members of the "species". Together these make up the species' pan-genome.
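
The three sets defined above reduce to plain set operations. The following is a minimal sketch in Python; the strain names and gene labels are invented purely for illustration and do not come from any study listed here.

```python
# Toy gene content for three hypothetical strains (names and genes are invented).
strains = {
    "strain_A": {"dnaA", "gyrB", "recA", "blaTEM"},
    "strain_B": {"dnaA", "gyrB", "recA", "tetA"},
    "strain_C": {"dnaA", "gyrB", "recA", "blaTEM", "mphD"},
}

def partition_pangenome(gene_sets):
    """Return (core, flex, pan) gene sets for a mapping of strain -> genes."""
    all_sets = list(gene_sets.values())
    pan = set().union(*all_sets)          # genes seen in at least one strain
    core = set.intersection(*all_sets)    # genes present in every strain
    flex = pan - core                     # genes in some, but not all, strains
    return core, flex, pan

core, flex, pan = partition_pangenome(strains)
print(sorted(core))
print(sorted(flex))
print(len(pan))
```

Real analyses first cluster genes into orthologous families and often relax "all members" to a frequency cutoff; the exact-set version here only mirrors the definitions as worded.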

Created with PubMed® Query: ( pangenome[TIAB] OR "pan-genome"[TIAB] OR "pan genome"[TIAB] ) NOT pmcbook NOT ispreviousversion
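
One possible way to re-run this query programmatically is through the NCBI E-utilities ESearch endpoint. The sketch below only builds the request URL and performs no network call. The helper name `build_esearch_url` and the choice of `retmax=100` are assumptions made for illustration, and nothing here describes how this bibliography was actually generated.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# The query string exactly as shown above.
QUERY = ('( pangenome[TIAB] OR "pan-genome"[TIAB] OR "pan genome"[TIAB] ) '
         'NOT pmcbook NOT ispreviousversion')

def build_esearch_url(term, retmax=100):
    """Build an ESearch URL for PubMed (hypothetical helper; no request is sent)."""
    params = {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    return ESEARCH + "?" + urlencode(params)

url = build_esearch_url(QUERY)
print(url)
```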

Citations: The Papers (from PubMed®)

RevDate: 2026-02-06

Dzaraly ND, Muthanna AR, John J, et al (2026)

Comparative Whole Genome Sequencing of Seven Invasive Streptococcus pneumoniae Isolates from Malaysia Reveals Genetic Diversity, Recombination events, and Global Lineage Linkages.

Journal of applied microbiology pii:8466407 [Epub ahead of print].

BACKGROUND: Streptococcus pneumoniae remains a major global health threat, causing diseases ranging from mild respiratory infections to severe conditions like pneumonia, sepsis, and meningitis. Although pneumococcal conjugate vaccines (PCVs) including PCV7, PCV10, and PCV13 have significantly reduced disease burden, especially in children, S. pneumoniae continues to exhibit high serotype and genetic diversity. Whole genome sequencing (WGS) analysis offers high-resolution insights into clonal lineages and multidrug-resistant strains. However, genomic data on Malaysian isolates remain limited.

METHODS: This study characterised the whole genome features and comparative profiles of seven invasive S. pneumoniae isolates from two tertiary hospitals in Malaysia. WGS analyses described serotype, sequence type (ST), antimicrobial resistance determinant genes, pan-genome structure, and recombination events.

RESULTS: The average genome size was ∼2.12 Mbp, with 1 988-2 205 coding sequences. WGS-based MLST identified five sequence types (ST236, ST320, ST386, ST671, ST695), with ST236 linked to serotypes 19A and 19F related to PMEN clones Taiwan19F-14 and CC271. Core genome analysis with 35 global reference strains revealed three major clades. Notably, isolates TSP95, SSP45, and SSP46 clustered closely with strains from South Korea, suggesting a long-term persistence of ST320 over a decade. Recombination analysis identified both shared and isolate-specific events, forming distinct phylogenetic clusters. Extensive shared recombination was observed in several isolates, while others displayed isolate-specific events, indicating ongoing genetic diversification.

CONCLUSION: These findings underscore the critical role of recombination in shaping pneumococcal population structure, evolution, and adaptation.

RevDate: 2026-02-06

Selleri E, Tarracchini C, Petraro S, et al (2026)

Assessment of genome evolution in Bifidobacterium adolescentis indicates genetic adaptation to the human gut.

mSystems [Epub ahead of print].

UNLABELLED: Bifidobacterium adolescentis is one of the most frequently encountered bifidobacterial species present in the adult human gut microbiota, with a prevalence of approximately 60%. Despite its high prevalence, B. adolescentis has not been extensively studied and characterized, and our understanding of its physiological traits, genetic diversity, and potential interactions with other members of the human gut microbiota or with its host is therefore fragmentary. In the current study, a data set comprising 1,682 B. adolescentis genomes was compiled by combining publicly available data and metagenome assemblies from 131 projects to uncover the unique genetic characteristics of this species. A pangenome analysis of B. adolescentis identified 203 clusters of orthologous genes absent from the other five human-associated Bifidobacterium species, six of which were in silico predicted to encode functions unique to this taxon. Furthermore, 2,597 genes were predicted to have been acquired by horizontal gene transfer, including genes encoding extracellular structures involved in interaction with the host and other microorganisms, and phage defense mechanisms against bacteriophages. Detailed phylogenetic analysis revealed seven clusters within the B. adolescentis species, each partially associated with the origin of strain isolation, suggesting phylogenetic differentiation shaped by geographical strain origin. Moreover, a large-scale metagenomic analysis of over 10,000 human gut metagenomes from healthy adults revealed that B. adolescentis co-occurs with 36 putative beneficial commensals and butyrate-producing taxa, highlighting its role as a key bifidobacterial species involved in microbial networking within the adult human gut microbiota.

IMPORTANCE: To comprehensively explore the biodiversity within a microbial species, the reconstruction of a substantial number of genomes is essential. In this study, we successfully uncovered the genetic diversity of Bifidobacterium adolescentis by retrieving a large number of genomes from human gut metagenomic samples. The complete overview of the B. adolescentis pangenome enabled us to investigate the genetic features that distinguish this gut commensal from other bifidobacterial species residing in the human intestinal microbiota.

RevDate: 2026-02-06
CmpDate: 2026-02-06

Versoza CJ, Bales KL, Jensen JD, et al (2026)

Characterizing the rates and patterns of de novo germline mutations in coppery titi monkeys (Plecturocebus cupreus).

bioRxiv : the preprint server for biology pii:2026.01.15.699688.

Although recent advances in genomics have enabled the high-resolution study of whole genomes, our understanding of one of the key evolutionary processes, mutation, still remains limited. In primates specifically, studies have largely focused on humans and their closest evolutionary relatives, the great apes, as well as a handful of species of biomedical or conservation interest. Yet, as biological variation in mutation rates has been shown to vary across genomic regions, individuals, and species, a greater understanding of the underlying evolutionary dynamics at play will ultimately be illuminated by not only additional sampling across the Order, but also by a greater depth of sampling within-species. To address these needs, we here present the first population-scale genomic resources for a platyrrhine of considerable biomedical interest for both social behavior and neurobiology, the coppery titi monkey (Plecturocebus cupreus). Deep whole-genome sequencing of 15 parent-offspring trios, together with a computational de novo mutation detection pipeline based on pan-genome graphs, has provided a detailed picture of the sex-averaged mutation rate - 0.63 × 10^-8 (95% CI: 0.43 × 10^-8 to 0.90 × 10^-8) per site per generation - as well as the effects of both sex and parental age on underlying rates, demonstrating a significant paternal age effect. Coppery titi monkey males exhibit long reproductive lifespans, afforded by long-term pair bonding in the species' monogamous mating system, and our results have demonstrated that individuals reproducing later in life exhibit one of the strongest male mutation biases observed in any non-human primate studied to date. Taken together, this study thus provides an important piece of the puzzle for better comprehending the mutational landscape across primates.
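
A per-site, per-generation rate from trio data is commonly estimated as the number of de novo mutations divided by twice the number of callable haploid sites, summed over trios. The sketch below assumes that simple estimator; the abstract does not describe the authors' exact calculation, which may include corrections such as false-negative rates. The function name and all numbers are invented for illustration.

```python
def mutation_rate_per_site(dnm_counts, callable_sites):
    """Sex-averaged de novo mutation rate per site per generation.

    dnm_counts: de novo mutations called in each trio's offspring.
    callable_sites: haploid genome positions callable in each trio.
    The factor 2 accounts for the two parental haplotypes transmitted.
    """
    if len(dnm_counts) != len(callable_sites):
        raise ValueError("one callable-site count is needed per trio")
    return sum(dnm_counts) / (2 * sum(callable_sites))

# Invented numbers for three hypothetical trios (not data from the study).
rate = mutation_rate_per_site([30, 34, 32], [2_500_000_000] * 3)
print(f"{rate:.2e}")
```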

RevDate: 2026-02-06
CmpDate: 2026-02-06

Dockman RL, EA Ottesen (2026)

Niche specialization and cross-feeding interactions shaping gut microbial fiber degradation in a model omnivore.

bioRxiv : the preprint server for biology pii:2026.01.22.701066.

The gut microbiome plays an active role in host health, producing gut metabolites that influence host digestive and immune function while also mediating microbial crosstalk. Dietary fiber is a major source of important fermentation byproducts that are generally implicated in gut community stability and host wellbeing, but dissecting microbe-specific contributions to polysaccharide metabolism in the context of a complex gut community is challenging with conventional model organisms. Using the American cockroach (Periplaneta americana) as a model omnivore, we use chemically-defined synthetic diets to identify how complex gut microbial communities respond to two of the most abundant plant polysaccharides, xylan and cellulose. To do so, we fed cockroaches synthetic diets containing one of these fibers or a mix of both in differing ratios. Through both 16S rRNA gene profiling and RNA-seq, we show that mixed fibers enrich for organisms characteristic of the source fibers as well as additional organisms only enriched in mixed-fiber diets. Through an organism-centric pangenome approach, we identify the impact of these fibers on gut microbiome activity. We found that gut communities responded strongly to xylan, with Bacteroidota belonging to Bacteroides, Dysgonomonas, and Parabacteroides producing xylan-active CAZymes at high levels. Multiple groups of Bacillota also responded strongly to a xylan diet, but appeared to act as cross-feeding secondary degraders, producing primarily xylosidases and transcripts associated with xylose utilization. In contrast, cellulose diets were associated with higher transcriptional activity among Fibrobacterota, which are typically a minor component of the cockroach gut microbiome but were the primary producers of CAZymes associated with cellulose and cellobiose degradation. These experiments provide new insight into gut microbial metabolism of these complex plant polysaccharides. Further, they highlight the utility of the cockroach model and synthetic diets to answer fundamental questions about gut microbial responses to different polysaccharides alone and in combination.

RevDate: 2026-02-06
CmpDate: 2026-02-06

Sanaullah A, Brown NK, Shakya P, et al (2026)

RLBWT-Based LCP Computation in Compressed Space for Terabase-Scale Pangenome Analysis.

bioRxiv : the preprint server for biology pii:2026.01.23.701410.

UNLABELLED: Lossless full text indexes are utilized in a myriad of applications in bioinformatics. The continuously decreasing cost of generating biological data has resulted in the need to build full text indexes on biological datasets of increasing size. Many compressed full text indexes have been developed to address this problem. In particular, run-length Burrows-Wheeler transform (RLBWT) based compressed full text indexes have seen wide development and adoption. However, the construction of these RLBWT-based compressed full text indexes is still computationally expensive, sometimes prohibitively so, even for current dataset sizes. Therefore, we present algorithms for the construction of RLBWT-based compressed full text indexes and their supporting data structures in compressed space. The algorithms have a space complexity of O(r) words and run in O(n) time for repetitive datasets, where r is the number of runs in the BWT, n is the length of the text, and a dataset counts as repetitive when its average run length is at least log n. We provide the first algorithm to compute LCP-related information for repetitive datasets in optimal time and O(r) space, greatly reducing memory requirements. The key idea behind this algorithm is the utilization of r samples of the inverse suffix array at regular intervals. For example, on the Human Pangenome Reference Consortium Release 2 dataset, this reduces peak memory from 2,135 GiB to 170 GiB (12.6x reduction) compared to the previous best method (pfp-thresholds).
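
To make the quantities n and r concrete, the toy sketch below computes a Burrows-Wheeler transform by sorting rotations and then counts its runs. This is a naive construction for illustration only and is not the compressed-space algorithm the abstract describes. The function names and the input string are assumptions.

```python
def bwt(text, sentinel="$"):
    """Burrows-Wheeler transform via sorted rotations (quadratic; toy inputs only)."""
    if sentinel in text:
        raise ValueError("text must not contain the sentinel")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)

def run_length_encode(s):
    """Collapse a string into (character, run length) pairs."""
    runs = []
    for ch in s:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            runs.append((ch, 1))
    return runs

# A deliberately repetitive toy text standing in for a pangenome.
text = "GATTACA" * 8
transformed = bwt(text)
runs = run_length_encode(transformed)
n, r = len(transformed), len(runs)
print(n, r, round(n / r, 2))   # text length, number of runs, average run length
```

The more repetitive the input, the smaller r becomes relative to n, which is why indexes whose size depends on r suit collections of near-identical genomes.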

AVAILABILITY: The implementation is available at https://github.com/ucfcbb/TeraTools.

SUPPLEMENTARY INFORMATION: Supplementary Material is available online at bioRxiv.

RevDate: 2026-02-06
CmpDate: 2026-02-06

Yang Q, Wang P, Yang X, et al (2025)

Evolution and phylogenetic characteristics of the first Brucella canis strain isolated from a human patient in Yunnan Province, China.

Frontiers in cellular and infection microbiology, 15:1743711.

INTRODUCTION: Brucella canis is a zoonotic pathogen that infects both dogs and humans, yet its evolutionary and phylogenetic characteristics are poorly understood.

METHODS: Here, we comprehensively characterized an isolated strain of B. canis through integrated bacteriological, comparative genomic, and whole-genome sequencing-based core genome single-nucleotide polymorphism (WGS-cgSNP) analyses.

RESULTS: B. canis YN20042 was isolated from a febrile patient (38 °C) with sweating and fatigue. The culture exhibited rough, grayish white, sticky, and opaque colonies. The isolate was identified as Brucella strain by a BCSP-31 polymerase chain reaction (PCR) assay, which yielded an amplicon of the expected 223-bp size, and was classified as a B. canis strain by conventional biotyping. The patient reported frequent contact with dogs and livestock. The strain showed a 99.99% average nucleotide identity to the B. canis reference strain ATCC 23365 (GCA_000018525.1). An in silico multilocus sequence typing (MLST) analysis showed that the strain belonged to sequence type 21, which was consistent with its classification within B. canis. The genome of strain YN20042 exhibited strong synteny with the reference strain and showed no detectable structural variations. It harbored 12 predicted virulence factors encompassing 71 associated genes, although it notably lacked the wbpL gene but contained a Brucella suis mprF gene. A further analysis identified predicted mutations in key virulence genes (eryA, pagN, bmaC, cfa1, and cfa2) and predicted multiple horizontally acquired genes, collectively suggesting a complex evolutionary trajectory involving both gene variants and potential recombination events. A WGS-SNP analysis revealed that YN20042 clustered closely with strains isolated from Zhejiang and Beijing, indicating a high degree of genetic relatedness.

CONCLUSION: The first isolation of B. canis in the region expands the local spectrum of pathogenic Brucella and highlights the substantial infection risk for individuals with close dog and livestock contact. Enhanced surveillance, targeted screening of high-risk populations, and public health education are necessary to mitigate the risk of B. canis transmission.

RevDate: 2026-02-06
CmpDate: 2026-02-06

Eskandar P, Paten B, J Sirén (2026)

Lossless Pangenome Indexing Using Tag Arrays.

Research square pii:rs.3.rs-8233501.

Pangenome graphs represent the genomic variation by encoding multiple haplotypes within a unified graph structure. However, efficient and lossless indexing of such structures remains challenging due to the scale and complexity of pangenomic data. We present a practical and scalable indexing framework based on tag arrays, which annotate positions in the Burrows-Wheeler transform (BWT) with graph coordinates. Our method extends the FM-index with a run-length compressed tag structure that enables efficient retrieval of all unique graph locations where a query pattern appears. We introduce a novel construction algorithm that combines unique k-mers, graph-based extensions, and haplotype traversal to compute the tag array in a memory-efficient manner. To support large genomes, we process each chromosome independently and then merge the results into a unified index using properties of the multi-string BWT and r-index. Our evaluation on the HPRC graphs demonstrates that the tag array structure compresses effectively, scales well with added haplotypes, and preserves accurate mapping information across diverse regions of the genome. This indexing method enables lossless and haplotype-aware querying in complex pangenomes and offers a practical indexing layer to develop scalable aligners and downstream graph-based analysis tools. The index additionally supports efficient one-to-all coordinate translation, enabling any interval on a haplotype to be mapped to its corresponding intervals across all other haplotypes in the graph.
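
The idea of tagging sorted-suffix positions with graph coordinates can be illustrated on a toy scale. The sketch below substitutes a plain sorted suffix list and binary search for the FM-index, and uses a bare node id where the paper refers to graph coordinates, so it is a simplified stand-in and not the authors' data structure. The haplotypes, node ids, and the helper `locate_nodes` are invented.

```python
from bisect import bisect_left, bisect_right
from itertools import groupby

# Two hypothetical haplotypes; each base is labelled with an invented graph node id.
haplotypes = ["ACGT", "ACTT"]
node_ids = [[1, 1, 2, 4], [1, 1, 3, 4]]

# Collect every suffix of every haplotype with the node of its first base,
# then sort so that the tags follow suffix (and hence BWT) order.
entries = sorted(
    (hap[i:] + "$", nodes[i])
    for hap, nodes in zip(haplotypes, node_ids)
    for i in range(len(hap))
)
suffixes = [suffix for suffix, _ in entries]
tags = [tag for _, tag in entries]

# Run-length compressed view of the tag array.
tag_runs = [(tag, len(list(group))) for tag, group in groupby(tags)]

def locate_nodes(pattern):
    """Return the distinct node ids at which occurrences of pattern start."""
    lo = bisect_left(suffixes, pattern)
    hi = bisect_right(suffixes, pattern + "~")   # "~" sorts after A, C, G, T and $
    return sorted(set(tags[lo:hi]))

print(tag_runs)
print(locate_nodes("T"))
```

Because suffixes that share a prefix sit next to each other and often lie in the same node, the tag array tends to form long runs, which is what makes run-length compression worthwhile.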

RevDate: 2026-02-05

Savin M, Erler T, Carlsen L, et al (2026)

Cefiderocol-resistant Aeromonas with expanded Resistomes in German hospital wastewater: Phenotypic and genomic evidence from the environment-clinical Interface.

The Science of the total environment, 1017:181478 pii:S0048-9697(26)00138-5 [Epub ahead of print].

Hospital wastewater is a key interface between clinical and environmental reservoirs of antimicrobial resistance, fostering selection and horizontal gene transfer. Aeromonas spp. are aquatic opportunistic pathogens with highly plastic genomes and are increasingly recognized as potential intermediaries in resistance dissemination. We compared 72 cefiderocol-selected Aeromonas isolates recovered from untreated hospital wastewater collected at six tertiary care hospitals across Germany with 62 clinical isolates from patients with intestinal and extraintestinal infections, to characterize cefiderocol susceptibility, resistome composition, and genomic mobility features. Pangenome analysis revealed an open genome structure comprising 21,364 gene clusters, with a core genome of 2486 genes and a large cloud gene pool (15,612 clusters present in <15% of isolates), highlighting extensive genomic plasticity. Resistance phenotypes diverged markedly: cefiderocol-selected wastewater isolates exhibited high resistance rates to multiple clinically relevant agents - ciprofloxacin (93.1%), aztreonam (81.2%), and trimethoprim-sulfamethoxazole (38.9%), whereas clinical isolates remained largely susceptible overall (<10%). Under iron limitation, siderophore production increased in both cohorts; however, in the presence of cefiderocol it remained robust in wastewater isolates while being suppressed in clinical isolates. Comparative genomics showed that wastewater isolates carried substantially expanded resistomes (mean 13.8 ARGs; range 2-27) relative to clinical isolates (mean 2.6; range 1-11), including enrichment of clinically relevant β-lactamases and carbapenemases. This resistance burden coincided with a larger and more transmissible plasmidome and a high insertion sequence load. Notably, extensive plasmid-backbone homology was detected between Aeromonas and co-occurring cefiderocol-resistant Enterobacterales isolated from the same wastewater samples, highlighting interspecies gene flow at the hospital-environment interface. Together, these findings identify hospital wastewater as a reservoir and convergence point for highly resistant, mobilome-enriched Aeromonas subpopulations captured under cefiderocol selection, supporting Aeromonas as a One Health sentinel and emphasizing the value of wastewater-based surveillance for tracking mobile resistance determinants bridging environmental and clinical compartments.
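
Core and cloud categories of the kind mentioned above are typically assigned from the fraction of isolates carrying each gene cluster. The sketch below uses the <15% cloud cutoff stated in the abstract; the 99% core cutoff, the function name, and the counts are assumptions, and the total of 134 isolates is simply 72 + 62 taken as an assumed denominator.

```python
def classify_clusters(presence, n_isolates, core_min=0.99, cloud_max=0.15):
    """Label each gene cluster by the fraction of isolates that carry it.

    presence: mapping of cluster name -> number of isolates carrying it.
    core_min is an assumed cutoff; cloud_max mirrors the <15% in the abstract.
    """
    labels = {}
    for cluster, count in presence.items():
        frequency = count / n_isolates
        if frequency >= core_min:
            labels[cluster] = "core"
        elif frequency < cloud_max:
            labels[cluster] = "cloud"
        else:
            labels[cluster] = "shell"
    return labels

# Invented counts for three hypothetical clusters across 134 isolates.
counts = {"cluster_1": 134, "cluster_2": 60, "cluster_3": 5}
labels = classify_clusters(counts, n_isolates=134)
print(labels)
```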

RevDate: 2026-02-05
CmpDate: 2026-02-05

Huang Y, Zhang Y, Zhang Q, et al (2026)

Multiscale pangenome graphs empower the genomic dissection of mixed-ploidy sugarcane species.

Science (New York, N.Y.), 391(6785):eadx1616.

The sugarcane genus Saccharum is characterized by complex genomes with diverse ploidy levels. We developed a multiscale graph-based pangenome representation, which integrates nine genome assemblies into a unified reference, representing modern cultivars and founding species. Each homo(eo)logous (encompasses both homologous and homeologous relationships) chromosome set retains 47 to 57 haplotypes and ~74,000 to 271,000 gene alleles. This framework enables multiomics exploration, encompassing homo(eo)log systems and epigenomic signatures. The pangenome facilitates population genomics analyses of 417 mixed-ploidy Saccharum accessions, revealing convergent selection and identifying the Andropogoneae TB1 homolog linked to tillering as a promising gene-editing target to boost cane yield. Additionally, the pangenome supports dosage-informed genome-wide association study, improving heritability estimates and identification of sugar or leaf-angle-associated loci, including SaIRX10 and SaBAK5. Our analytical framework establishes a foundation for graph-based genetic studies in sugarcane and other polyploid genomes.

RevDate: 2026-02-05
CmpDate: 2026-02-05

Ma W, Liu Y, Wei X, et al (2026)

Gapless pangenome analyses reveal fast Brassica rapa subspeciation.

Science (New York, N.Y.), 391(6785):eady7590.

Brassica rapa (Br) encompasses many morphotypes and subspecies, so it is a good model with which to investigate plant diversification and subspeciation. Here, we resequenced the genomes of 1720 Br accessions and de novo assembled 11 representative telomere-to-telomere gapless genomes for seven elite subspecies that underwent intensive morphotypification and developed distinct agronomic traits valued to agriculture. We identified 6992 unknown genes, 110 complete (peri)centromeres, and five new satellites associated with Br morphotypes and subspecies and Brassica species evolution. The pangenome, built on 11 gapless and 20 published genomes, reveals structural variations and gene diversities among Br subspecies. Pangenome-wide association studies uncovered that the gene BrLH1 controls leaf-head formation. We show that structural changes have occurred in satellites, (peri)centromeres, and genes, contributing to fast subspeciation and morphotypification during the short history of Br cultivation, providing invaluable resources for Brassica breeding.

RevDate: 2026-02-05
CmpDate: 2026-02-05

Wu Y, Jones JDG, X Lin (2026)

The Application of RenSeq for Pan-NLRome Analysis.

Methods in molecular biology (Clifton, N.J.), 3012:113-127.

Resistance gene enrichment sequencing (RenSeq) was developed in 2013. It has accelerated the cloning of plant NLR genes and has contributed to resistance breeding for multiple crop plants, such as potato, wheat, and rice. By combining with other strategies, many applications were developed, such as cDNA-RenSeq, dRenSeq, SMRT-RenSeq, RLP/KSeq, AgRenSeq, and MutRenSeq. These methods have been widely applied in different crops. In this protocol, we present a step-by-step guide for applying RenSeq in gene cloning and Pan-NLRome analysis. The protocol covers bait design, library preparation, target enrichment, and downstream bioinformatic analysis. This methodology can make RenSeq more accessible to researchers working with different crops and enhance our understanding of plant resistance genes in the age of pan-genome.

RevDate: 2026-02-05
CmpDate: 2026-02-05

Fang C, Zhou Z, Zhang X, et al (2026)

Epidemiological and Genomic Insights into Linezolid-Non-Susceptible Enterococci in Pediatric Patients.

Current microbiology, 83(3):155.

Enterococci are major opportunistic pathogens causing healthcare-associated infections in children. Linezolid, a WHO-designated critically important antibiotic for multidrug-resistant Gram-positive infections, is increasingly challenged by linezolid-non-susceptible enterococci (LNSE). Yet pediatric LNSE epidemiology and genomics data remain scarce, hindering targeted control. We analyzed 26 LNSE strains isolated from Children's Hospital, Zhejiang University School of Medicine (June 2020-July 2024) using MALDI-TOF MS, Vitek2 Compact, micro-broth dilution (for linezolid MIC), MLST, resistance/virulence gene detection, and pan-genome analysis (COG/KEGG annotation). Enterococcus faecalis (E. faecalis) dominated (23/26, 88.5%) with ST16 as the major sequence type (ST) and four novel STs identified; all strains harbored optrA and fexA, with species-specific resistance/virulence gene profiles. The 23 E. faecalis strains exhibited an open pan-genome (b = 0.174725), indicating the possible existence of active horizontal gene transfer (HGT), with core, accessory, and unique genes showing distinct functional differentiation. These findings provide critical and robust empirical data to inform the development of targeted prevention and control strategies against LNSE in pediatric populations.
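
The reported b is presumably the exponent of a power-law fit of pan-genome size against the number of genomes sampled, with 0 < b < 1 read as "open"; the abstract does not state the model, so this reading is an assumption. Under that assumption, the fit can be sketched as a least-squares line on log-log axes. The curve below is synthetic and the function name is hypothetical.

```python
import math

def fit_power_law(genome_counts, pan_sizes):
    """Least-squares fit of pan_size = a * n**b on log-log axes; returns (a, b)."""
    xs = [math.log(n) for n in genome_counts]
    ys = [math.log(p) for p in pan_sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope on log-log axes is the exponent
    a = math.exp(mean_y - b * mean_x)  # intercept gives the scale factor
    return a, b

# Synthetic curve generated from a = 2500, b = 0.17 (not data from the study).
counts = list(range(1, 24))
sizes = [2500 * n ** 0.17 for n in counts]
a, b = fit_power_law(counts, sizes)
print(round(a), round(b, 3), "open" if 0 < b < 1 else "closed")
```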

RevDate: 2026-02-03

Araújo MRB, Dos Santos LS, Viana MVC, et al (2026)

Comparative genomics and molecular characterization of a multidrug-resistant Corynebacterium glucuronolyticum isolated for the first time from the human genitourinary tract in Latin America.

Brazilian journal of microbiology : [publication of the Brazilian Society for Microbiology], 57(1):49.

UNLABELLED: Although Corynebacterium glucuronolyticum has been associated with human infections, its pathogenic potential remains poorly understood. Here, we describe the first case in Latin America of C. glucuronolyticum isolated from the human urogenital tract. The strain, designated IHP2022, was identified by MALDI-TOF MS (99% probability) and exhibited resistance to benzylpenicillin, clindamycin, and tetracycline, characterizing a multidrug-resistant (MDR) phenotype. Genomic analysis revealed a 2.88-Mb genome with 59.04% G + C content and no plasmids. Comparative genomic analysis, including 11 other publicly available genomes, demonstrated high genetic diversity and positioned IHP2022 close to strain p3-SID752 from the USA, suggesting a broad geographical distribution. The genome harbored multiple virulence and resistance genes, as well as a Type I-E CRISPR-Cas system. Functional annotation and pangenome analysis identified 4,027 gene families categorized into core, shell, and cloud components. By integrating phenotypic and genomic data, this study provides the first in-depth characterization of an MDR C. glucuronolyticum strain, narrowing current knowledge gaps and contributing to a better understanding of its pathogenic potential.

SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s42770-025-01822-7.

RevDate: 2026-02-04
CmpDate: 2026-02-04

Park S, Lee H, Yook S, et al (2026)

Oceanimonas aquatica sp. nov. and Arenibacter flavimaris sp. nov., isolated from seawater.

International journal of systematic and evolutionary microbiology, 76(2):.

The novel strains CHS3-5[T] and M-2[T] were isolated from seawater collected near Suaeda japonica colonies on Seongmodo Island, Republic of Korea. Strain CHS3-5[T] was Gram-stain-negative, motile with flagella, rod-shaped, strictly aerobic and formed circular, convex, ivory-coloured colonies, while strain M-2[T] was Gram-stain-negative, motile by gliding, rod-shaped, strictly aerobic and formed circular, raised, dark yellow colonies. Based on 16S rRNA and draft genome analyses, strains CHS3-5[T] and M-2[T] were identified as members of the Oceanimonas and Arenibacter genera, respectively. Strain CHS3-5[T] grew at temperatures of 10-40 °C, pH 4.0-10.0 and in the presence of 2.0-11.0% NaCl, with optimal growth at 30 °C, pH 7.0 and 3.0% NaCl. Strain M-2[T] grew at temperatures of 15-40 °C, pH 6.0-9.0 and in the presence of 2.0-4.0% NaCl, with optimal growth at 30 °C, pH 7.0 and 3.0% NaCl. Both novel strains showed low genomic relatedness to their respective type species. The average nucleotide identity and digital DNA-DNA hybridization values were 84.5-85.7% and 26.5-34.7% for strain CHS3-5[T] and 76.6-85.9% and 18.6-30.3% for strain M-2[T], respectively, supporting their classification as novel species. We propose the names Oceanimonas aquatica sp. nov. (type strain CHS3-5[T]=KACC 23248[T]=TBRC 17651[T]) and Arenibacter flavimaris sp. nov. (type strain M-2[T]=KACC 23249[T]=TBRC 17650[T]) for these strains.
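
The species-level argument above rests on comparing genomic relatedness values with cutoffs. The sketch below assumes the commonly cited boundaries of about 95% ANI and 70% digital DNA-DNA hybridization, which the abstract itself does not state, and applies them to the upper ends of the reported ranges. The function name is hypothetical.

```python
ANI_CUTOFF = 95.0    # percent; commonly cited species boundary (assumed here)
DDDH_CUTOFF = 70.0   # percent; commonly cited species boundary (assumed here)

def supports_novel_species(max_ani, max_dddh):
    """True when the closest relative falls below both species cutoffs."""
    return max_ani < ANI_CUTOFF and max_dddh < DDDH_CUTOFF

# Upper ends of the ANI and dDDH ranges reported in the abstract.
reported = {"CHS3-5": (85.7, 34.7), "M-2": (85.9, 30.3)}
results = {
    strain: supports_novel_species(ani, dddh)
    for strain, (ani, dddh) in reported.items()
}
print(results)
```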

RevDate: 2026-02-04

Treen RZ, Gonzalez-Juarrero M, Jackson M, et al (2026)

Mycobacterium abscessus research: learning from challenges.

Journal of bacteriology [Epub ahead of print].

Mycobacterium abscessus (Mab), a rapidly growing mycobacterial species with intrinsic and acquired resistance to multiple antibiotics, is an emerging public health concern. The rise in clinical cases of treatment-refractory infections of M. abscessus has propelled its research toward novel therapeutic approaches. The number of publications entitled "Mycobacterium abscessus" has increased by ~300% over the last decade, of which the majority of studies exploring the fundamental biology and pathogenesis of Mab have used the reference strain ATCC19977. However, whole-genome sequence analyses, combined with transposon-seq based functional genomics, reveal an open pan-genome with significant variations in the essential genes across ATCC19977 and clinical isolates. These new discoveries demand a careful selection of strains and growth conditions in experimental design. In this minireview, we discuss these challenges and propose a framework for future M. abscessus studies in silico, including a new web-based resource for pangenome analysis, in vitro, and in animal models.

RevDate: 2026-02-03

Downing T (2026)

Approaches to Studying Virus Pangenome Variation Graphs.

Genomics, proteomics & bioinformatics pii:8456561 [Epub ahead of print].

Pangenome variation graphs (PVGs) allow for the representation of genetic diversity in a more nuanced way than traditional reference-based approaches. Here I focus on how PVGs are a powerful tool for studying genetic variation in viruses, offering insights into the complexities of viral quasispecies, mutation rates, and population dynamics. PVGs originated in human genomics and hold great promise for viral genomics. Previous work has been constrained by small sample sizes and gene-centric methods, whereas PVGs enable a more comprehensive approach to studying viral diversity. Large viral genome collections should be used to make PVGs, which offer significant advantages. Here, I outline accessible tools to achieve their construction. This spans PVG construction, PVG file formats, PVG manipulation and analysis, PVG visualisation, measuring PVG openness, and mapping reads to PVGs. Additionally, the development of PVG-specific formats for mutation representation and personalised PVGs that reflect specific research questions will further enhance PVG applications. Challenges remain, particularly in managing nested variants, optimising error detection, optimising k-mer/minimizer-based approaches for AT-rich genomes, incorporating long read sequencing data, and scalable visualisation approaches. Nevertheless, PVGs offer a new opportunity for viral population genomics, and a testing ground for tool development prior to application to larger eukaryotic genomes. These advances will enable more accurate and comprehensive detection of viral mutations, contributing to a deeper understanding of viral evolution and genotype-phenotype associations.

RevDate: 2026-02-03

Droc G, Giraud D, Belser C, et al (2026)

A Super-Pangenome for Cultivated Citrus Reveals Evolutive Features During the Allopatric Phase of Their Reticulate Evolution.

Plant biotechnology journal [Epub ahead of print].

The main genetic diversity observed in cultivated citrus results from a reticulate evolution involving four ancestral taxa whose radiation occurred in allopatry. In such context, GWAS analysis, genome diversity and transcriptomic studies will be significantly enhanced through pangenome approaches. We report the implementation of a super-pangenome for cultivated citrus, established with de novo assemblies of C. medica, C. reticulata and C. micrantha, released for the first time alongside a published chromosome-scale assembly of C. maxima. Repetitive element annotation revealed that half of each genome consisted of transposable elements or DNA-satellites. The new genome assemblies display strong synteny and collinearity, while discrepancies are observed with the C. maxima assembly. Resequencing information from 55 accessions helped to explore the intra- and interspecific diversity of the ancestral taxa and their relationships with horticultural groups. Diagnostic SNPs of the ancestral taxa revealed interspecific introgressions in several representative accessions of C. reticulata, C. maxima and C. medica as well as insights into the origin and phylogenomic structures of horticultural groups. PAV analysis revealed a gene whose absence or presence was specific to one of the ancestral taxa. Diagnostic PAV analysis uncovered a large chloroplastic introgression in C. medica chromosome 4. The analysis of the functional enrichment and species-specific adaptations in the citrus super-pangenome revealed distinct functional specialisations. This highlights the evolutionary paths that have shaped species, contributing to the diversity in the citrus super-pangenome while maintaining a shared foundation of essential biological processes. We established a Genome Hub, offering a platform for continuous genomic research.

RevDate: 2026-02-02
CmpDate: 2026-02-02

Orozco-Ochoa AK, Quiñones B, Lee BG, et al (2026)

Pangenomics of high-risk international clones in Acinetobacter baumannii identifies distinctive virulence and antimicrobial resistance profiles.

Archives of microbiology, 208(4):173.

The bacterial pathogen Acinetobacter baumannii is an opportunistic and nosocomial causative agent of multidrug resistant infections worldwide. The present study conducted comparative genomic analyses to identify relevant pathogenicity traits in A. baumannii strains from diverse clinical samples and geographical regions in Mexico. Pangenome analysis clustered the strains into four phylogenomic clades, comprising various international clones. Clades I and II strains, predominantly from blood and respiratory infections in the Central region, were significantly associated with the Latin American IC5 clone (P = 0.0002), whereas clade III strains, primarily from diverse samples in the Northwestern region, were significantly associated with the European IC2 clone (P = 0.0030). Virulence determinants implicated in adhesion (ompA, omp38), biofilm formation (pgaA-D, csuA/BABCDE), motility (pil, fim), regulatory systems (bfmRS, barAB, abaR/abaI), iron acquisition (bas, bau), and efflux pump-delivery systems (adeFGH) were identified among the A. baumannii strains, representing all clades and geographical regions. Analysis of intrinsic and acquired antimicrobial resistance revealed that clades I and II strains were significantly correlated with resistance to β-lactamases (blaADC-6, blaOXA-239, blaOXA-65), sulfonamides (sul2), and chloramphenicol (cmlB1) (P = 0.0001). Interestingly, clade III strains, predominantly from the agricultural Northwestern region, exhibited a significant association of broader resistance genes against aminoglycosides (aac(6')-Ib', aph(3')-Ia, armA, aadA), β-lactamases (blaTEM-4, blaADC-25, blaOXA-66), sulfonamides (sul1), tetracyclines (tetA), and macrolides (mphD, msrE) (P = 0.0001). Subsequent characterization of mobile genetic elements indicated genetic plasticity and potential transfer of antimicrobial resistance. Collectively, this fundamental information would enable the improvement of epidemiological surveillance and intervention strategies for A. baumannii.

RevDate: 2026-02-02
CmpDate: 2026-02-02

Unlu Celebi S, Yalcin S, Kurt Azap O, et al (2026)

Genomic insights into the genetic diversity, resistance determinants, and plasmid content of carbapenem-resistant Acinetobacter baumannii clinical isolates.

Archives of microbiology, 208(4):169.

Carbapenem-resistant Acinetobacter baumannii (CRAB) is a critical nosocomial pathogen with limited therapeutic options. This study aimed to describe clonal relationships among CRAB isolates and genomic insights from representative clusters. A total of 128 non-duplicate CRAB isolates were included in the study. Pulsed-field gel electrophoresis (PFGE) was used to assess clonal relationships and as a preliminary clustering tool for isolate selection. Twelve representative isolates from distinct PFGE clusters were selected for whole-genome sequencing using Oxford Nanopore. Genome assembly, annotation, and comparative analyses were performed using Flye, Prokka, and Roary, respectively. Antimicrobial resistance (AMR) genes, plasmids, insertion sequences, integrons and prophages were identified using the CARD, MOB-suite, ISEscan, IntegronFinder, and PHASTEST tools, respectively. Multilocus sequence typing (MLST) and pangenome analyses were conducted to determine genetic diversity and relatedness among the CRAB isolates. Antibiotic susceptibility testing revealed an extensively drug-resistant phenotype; the colistin resistance rate was 23.4%. Mutations in lpxC, lpxD, pmrB, and lpxA were identified in colistin-resistant isolates, suggesting a possible role. Most isolates belonged to the globally disseminated clone ST2Pasteur, while others were classified as ST636 and ST78. Genomic comparisons identified diverse resistance genes, mobile genetic elements, plasmids, integrons, and virulence factors. Pangenome analysis uncovered considerable genomic diversity, with 2700 core genes (42.5%) and 3649 accessory genes (57.5%), including 1864 strain-specific (cloud) genes (29.4%) among the isolates. Overall, our findings demonstrate the complex genomic architecture of CRAB and highlight the potential role of genomic surveillance in local infection control.
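
Illustrative note (not from the paper): the core / accessory / strain-specific split reported above (2700 + 3649 = 6349 gene families) is the kind of partition that can be derived from a gene presence/absence table. The sketch below assumes a simple definition (core = present in every isolate, strain-specific = present in exactly one) and uses hypothetical gene and isolate names; the study's actual Roary settings are not given in the abstract.

```python
def partition_pangenome(presence):
    """Split gene families into core, accessory and strain-specific sets.

    `presence` maps a gene family name to the set of isolates carrying it.
    Assumed definitions: core = found in every isolate,
    accessory = everything else, strain-specific = found in exactly one isolate.
    """
    isolates = set().union(*presence.values())
    n_isolates = len(isolates)
    core = {g for g, carriers in presence.items() if len(carriers) == n_isolates}
    strain_specific = {g for g, carriers in presence.items() if len(carriers) == 1}
    accessory = set(presence) - core
    return core, accessory, strain_specific


# Hypothetical toy table: three isolates, three gene families.
toy = {
    "geneA": {"iso1", "iso2", "iso3"},  # present everywhere
    "geneB": {"iso1", "iso2"},          # present in some isolates
    "geneC": {"iso3"},                  # present in a single isolate
}
core, accessory, strain_specific = partition_pangenome(toy)
print(sorted(core), sorted(accessory), sorted(strain_specific))
```

Strain-specific families are a subset of the accessory set here, which matches how the abstract reports the 1864 cloud genes as part of the 3649 accessory genes.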

RevDate: 2026-02-02
CmpDate: 2026-02-02

Cole B, Zhang W, Shi J, et al (2026)

Multi-season analysis reveals hundreds of drought-responsive genes in sorghum.

The Plant journal : for cell and molecular biology, 125(3):e70657.

Persistent drought affects global crop production and has become more severe in many parts of the world in recent decades. Deciphering how plants respond to drought will facilitate the development of flexible mitigation strategies. Sorghum bicolor L. Moench (sorghum), a major cereal crop and an emerging bioenergy crop, exhibits remarkable resilience to drought. To better understand the molecular traits that underlie sorghum's remarkable drought tolerance, we undertook a large-scale sorghum gene expression profiling effort, totaling nearly 1500 transcriptome profiles, across a 3-year field study with replicated plots in California's Central Valley. This study included time-resolved gene expression data from roots and leaves of two sorghum genotypes, BTx642 and RTx430, with different pre-flowering and post-flowering drought-tolerance adaptations under control and drought conditions. Quantification of genotype-specific drought tolerance effects was enabled by de novo sequencing, assembly, and annotation of both BTx642 and RTx430 genomes. These reference-quality genomes were used to construct a pangene set for characterizing conserved and genotype-specific expression. By integrating time-resolved transcriptomic responses to drought in the field across three consecutive years, we identified a set of 726 drought-responsive genes that responded similarly in all 3 years of our field study. Functional enrichment analysis identified abiotic stress, secondary cell wall-related processes and metabolism as particularly affected under both types of drought stress. We also found that some glyoxylate cycle pathway genes, including malate synthase and isocitrate lyase, are differentially regulated particularly during post-flowering drought stress, implicating this pathway as potentially important for drought responsiveness. This expansive dataset represents a unique resource for sorghum and drought research communities and provides a methodological framework for the integration of multi-faceted time-resolved transcriptomic datasets.

RevDate: 2026-02-02
CmpDate: 2026-02-02

Nan H, Chen X, Zhang J, et al (2025)

Dualistic MADS-box evolution forged legume diversity post-WGD.

Frontiers in plant science, 16:1740598.

The MADS-box gene family plays a central role in plant development and adaptation, yet its evolutionary history in legumes is remarkably complex. In this study, we performed a pangenomic analysis across 52 legume species, identifying 4,872 MADS-box genes and reconstructing their phylogeny into 16 subfamilies. Our analysis uncovered a pervasive dualistic evolutionary model driven by distinct duplication mechanisms. Structurally, the genes fall into two categories: the compact, intron-poor Type I and the complex, intron-rich Type II. We demonstrate that whole-genome duplication (WGD) serves as the major driver (42.2%) behind the expansion of the conserved core genome, which includes key floral regulators such as the "ABCDE model" genes. These WGD-derived genes are under strong purifying selection, thereby ensuring developmental stability. In contrast, small-scale duplication (SSD) fuels the expansion of the dynamic periphery, primarily composed of Type I genes and stress-responsive clades, which evolve under relaxed selection and promote lineage-specific innovation, as strikingly exemplified by the massive tandem expansion of the SVP subfamily in Prosopis. Pangenome analysis confirmed that WGD-derived genes were enriched in the conserved core genome, underpinning essential functions, whereas SSD-derived genes dominated the variable genome and acted as a source of genetic novelty. Transcriptome analysis in soybean identified four organ-specific expression modules, predominantly comprising Type II core genes. Under biotic and abiotic stress, WGD-derived gene pairs exhibited prominent asymmetric expression. The expression divergence was validated by qRT-PCR. Overall, our findings establish a unified framework for MADS-box gene evolution in legumes, illustrating how divergent duplication mechanisms and selective pressures have collectively shaped a gene family critical to both evolutionary innovation and developmental stability.

RevDate: 2026-02-02
CmpDate: 2026-02-02

Brůna T, Sreedasyam A, Harder AM, et al (2026)

Evolutionary and methodological considerations when interpreting gene presence-absence variation in pangenomes.

NAR genomics and bioinformatics, 8(1):lqag011.

While graph-based pangenomes have become a standard and interoperable foundation for comparisons across multiple reference genomes, integrating protein-coding gene annotations across pangenomes in a single 'pangene set' remains challenging, both because of methodological inconsistency and biological presence-absence variation (PAV). Here, we review and experimentally evaluate the root of genome annotation and pangene set inconsistency using two polyploid plant pangenomes: cotton and soybean, which were chosen because of their existing diverse high-quality genomic resources and the known importance of gene PAV in their respective breeding programs. We first demonstrate that building pangene sets across different genome resources is highly error prone: PAV calculated directly from the genome annotations hosted on public repositories recapitulates structure in annotation methods and not biological sequence differences. Re-annotation of all genomes with a single identical pipeline largely resolves the broadest stroke issues; however, substantial challenges remain, including a surprisingly common case where exactly identical sequences have different gene model structural annotations. Combined, these results clearly show that pangenome gene model annotations must be carefully integrated before any biological inference can be made regarding sequence evolution, gene copy-number, or PAV.

RevDate: 2026-02-02
CmpDate: 2026-02-02

Yu C, Li W, Jiang Y, et al (2026)

Graph pan-genome advances genetic discoveries and the improvement of eggplant.

Horticulture research, 13(1):uhaf248.

Eggplant is one of the most important solanaceous vegetable crops worldwide. To explore its genomic diversity, we assembled two T2T-level reference genomes from the African eggplant 'Y11' (Solanum aethiopicum L.) and the cultivated variety 'Gui5' (Solanum melongena L.) with genome sizes of 1.10 and 1.13 Gb, respectively. The contig N50 lengths are 94.2 and 93.9 Mb, with 37 324 and 40 300 annotated protein-coding genes, respectively. We also sequenced 238 germplasms, primarily local and cultivated varieties from China, Southeast Asia, Europe, and Africa, identifying 7 853 531 high-quality single nucleotide polymorphisms. Phylogenetic trees and population structures suggest that the domestication of Chinese eggplants occurred later than in Southeast Asia and subsequently diverged into northern and southern groups within China, evolving relatively independently with limited genetic flow between these two groups. Their diversity is significantly lower than that of Southeast Asia and Europe. By selecting 22 representative accessions and four chromosome-level genomes, we constructed an Asian-representative eggplant pan-genome, assembling 463.94 Mb of nonreference sequences. Of these sequences, 38.3% are core genes, 46.9% are dispensable genes, and 14.9% are unique genes. Presence/absence variation genes were found to be highly associated with stress resistance in eggplants. Genome-wide association studies identified 946 SNPs and 9605 genes significantly associated with 10 important traits. Notably, genes involved in zeatin biosynthesis closely linked to plant auxins significantly impact fruit size and shape attributes, playing a crucial role in eggplant yield. This high-quality reference genome alongside the pan-genome will provide valuable insights into eggplant breeding advancement.

RevDate: 2026-02-01

Liao X, Xi Y, Liao B, et al (2026)

Chromosome-level genome of wild-simulated Panax ginseng identifies SNP markers for germplasm and medicinal quality evaluation.

Journal of advanced research pii:S2090-1232(26)00086-X [Epub ahead of print].

INTRODUCTION: Panax ginseng C. A. Mey., a precious traditional medicinal herb, demonstrates diverse pharmacological activities, including immunomodulation and anti-fatigue effects. However, prolonged cultivation has led to germplasm admixture and cultivar degeneration, resulting in inconsistent quality that severely compromises its medicinal value and industrial standardization. Therefore, establishing accurate and efficient germplasm evaluation tools is critical for ensuring the quality of P. ginseng medicinal materials.

OBJECTIVES: To integrate whole-genome sequencing and chemical fingerprinting for ginseng germplasm identification and quality consistency assessment.

METHODS: Chromosome-level genomes of ginseng from Jilin (JA) and Liaoning (FC) were assembled using PacBio HiFi, Illumina, and Hi-C technologies. Gene family identification and phylogenetic analysis were performed across 13 representative species. Using the JA genome as the reference, we constructed a pan-genome incorporating 7 ginseng genomes to dissect gene repertoire composition and structural variation distribution across ginseng populations. Population structure analysis with 76 individual ginseng samples revealed genetic diversity, and its integration with HPLC chemical fingerprints provided a joint assessment of quality consistency.

RESULTS: We assembled two chromosome-level genomes and, through comparative genomics, revealed significant expansion of ginsenoside biosynthesis-related gene families and subgenome divergence. The core gene set accounted for 54.1% of the pan-genome, indicating high genetic conservation. SNP distribution patterns from population resequencing enabled the development of germplasm-specific molecular markers and a genetic-chemical integrated evaluation model.

CONCLUSION: The molecular marker system and genetic-chemical joint assessment developed here provide reliable novel tools for germplasm identification and quality control, advancing standardization in the ginseng industry.

RevDate: 2026-02-01

Yang YH, Yao CY, Lin MJ, et al (2026)

Unmasking human T cell receptor germline diversity: 335 novel alleles identified in 47 Pangenome reference individuals using the gAIRR Suite.

Journal of advanced research pii:S2090-1232(26)00078-0 [Epub ahead of print].

INTRODUCTION: The adaptive immune receptor repertoire (AIRR), also referred to as expressed AIRR (exprAIRR) for clarity, comprises V(D)J-recombined T cell receptors (TR) and immunoglobulins (IG), and is central to adaptive immunity. Accurate exprAIRR profiling depends on a comprehensive and population-representative germline gene set encoding AIRR (gAIRR). In addition to serving as the reference for AIRR-seq, gAIRR alleles themselves are increasingly recognized as contributors to immune-related phenotypes, including disease susceptibility, variable vaccine responsiveness, and adverse immune events.

OBJECTIVES: Current germline TR references form an essential foundation, but based on scientific inference, likely represent only a portion of true human diversity. Some alleles lack flanking genomic information, and several populations remain underrepresented, constraints rooted in earlier sequencing methodologies. This study aims to substantially expand these references by resolving full-length TR alleles and their flanking regions across ancestrally diverse individuals.

METHODS: We analyzed 47 high-quality, phased diploid genomes from the Human Pangenome Reference Consortium (HPRC) using the gAIRR Suite. All novel alleles were crosschecked using two orthogonal pipelines: gAIRR-annotate (assembly-based annotation) and gAIRR-seq/gAIRR-call (targeted short-read sequencing and genotyping).

RESULTS: We identified 335 novel TR alleles (305 TRV and 30 TRJ), representing 91.6% and 30.9% increases over IMGT records (v3.1.41; accessed 2025-04-19), respectively. Many novel alleles occurred at substantial frequencies, particularly among individuals of African ancestry. We further established a comprehensive flanking sequence database, including recombination signal sequences (RSS), and documented allele-specific variations in these regions. Functional annotation showed that several novel alleles exhibit altered coding potential, including transitions to pseudogene or open reading frame (ORF) status.

CONCLUSION: This study significantly expands the known landscape of human TR germline diversity and provides a rigorously validated, population-diverse resource comprising novel alleles, flanking sequences, RSS profiles, and supporting analytical tools. These improved gAIRR references are essential for accurate germline genotyping and exprAIRR profiling, and will enable improved detection of immunogenetic associations. Our findings advance precision immunogenomics and support the development of ancestry-independent yet diversity-comprehensive genotyping, vaccines, and immunotherapies.

RevDate: 2026-01-30

Yang L, Gao Y, Kuhn KL, et al (2026)

Phased-assembly-driven pangenome graphs for structural variant genotyping and complex trait mapping in dairy cattle.

Nature communications pii:10.1038/s41467-026-68807-4 [Epub ahead of print].

Structural variants are an underexplored source of genetic diversity. As part of the FarmGTEx Project, here we report a Holstein breed-specific pangenome graph (H20D) using Minigraph-Cactus and 40 phased haploid assemblies from 20 cows. H20D outperforms both assembly- and read-based long-read callers, and far exceeds short-read approaches, identifying over 10,000 additional structural variants per sample. It also significantly improves structural variant detection and genotyping relative to graphs built across breeds or from fewer/unphased assemblies, with particular advantages in complex regions. Using H20D, we genotyped variants in 173 cattle and performed a GWAS, where a larger fraction of structural variants than SNPs reach genome-wide significance, implicating them as potential causal variants. Together, these results demonstrate the power of phased, within-breed pangenome graphs for accurate SV genotyping and trait mapping in dairy cattle.

RevDate: 2026-01-30

Miao J, Li D (2026)

TEvarSim: A genome simulator for transposable element (TE) variants.

PLoS computational biology, 22(1):e1013933 pii:PCOMPBIOL-D-25-02314 [Epub ahead of print].

Transposable element (TE) variants, the presence or absence of TE sequences such as LINE-1, Alu, SVA, and endogenous retroviruses, are a major source of genomic diversity and play critical roles in human health, evolution, and disease. As interest in TE variants grows, developing related methods and tools for detection has become increasingly important. However, rigorous benchmarking of TE variant detection methods remains limited due to the lack of accurate and scalable TE variant simulation platforms and the absence of reliable ground truth data. Here, we developed TEvarSim, a novel TE variant simulator that generates TE-containing genomic data in multiple formats, including genomes, short- and long-read sequencing data, and VCF files. TEvarSim supports both random and real-world TE insertions and deletions, including variants derived from pangenome graphs. It can rapidly simulate hundreds to thousands of synthetic chromosomes or genomes and model natural variation at the haplotype, individual, and population levels, making it well suited for large-scale studies. In addition, TEvarSim can directly compare simulated VCF files with TEs reported by TE detection tools, streamlining the benchmarking of TE genotyping methods. TEvarSim provides an all-in-one toolkit for simulating, evaluating, and improving TE variant detection, advancing our ability to accurately study TEs in health and disease in various species.

RevDate: 2026-01-30
CmpDate: 2026-01-30

Wang Z, Wang J, Wang Z, et al (2025)

Genome-Wide Identification of Mitochondrial Calcium Uniporter Family Genes in the Tomato Genus and Expression Profilings Under Salt Stress.

Current issues in molecular biology, 47(12): pii:cimb47121021.

The mitochondrial calcium uniporter (MCU) is a key channel controlling mitochondrial Ca²⁺ homeostasis, yet its role in plant stress responses remains unclear. Using the tomato pan-genome, this study identified 66 MCU genes across 12 tomato species and grouped them into two distinct evolutionary subfamilies. Phylogenetic, collinearity, and selection pressure analyses revealed that MCU genes are evolutionarily conserved and have undergone strong purifying selection. In addition, one MCU gene located on chromosome 6 appears to have originated before the divergence of monocots and dicots, indicating an ancient evolutionary trajectory. Gene structure and conserved motif analyses confirmed their structural conservation, while promoter cis-element analysis suggested that MCU genes are widely involved in light and hormone responsiveness. Expression profiling under salt stress showed that multiple MCU genes are differentially regulated in a time-dependent manner: SolycMCU1 and SolycMCU2 respond rapidly at early stages, whereas SolycMCU5 and SolycMCU6 are upregulated during middle and late phases. These results highlight the functional diversification of MCU genes in tomato under salt stress. This study provides the first comprehensive evolutionary and functional analysis of the tomato MCU gene family, offering insights into their stress-regulatory mechanisms and potential use in breeding salt-tolerant tomatoes.

RevDate: 2026-01-29

Ontano A, Sim SB, Jenkins J, et al (2026)

Independent centromeric expansions define giant hornet genomes.

BMC genomics pii:10.1186/s12864-025-12512-x [Epub ahead of print].

RevDate: 2026-01-28

Shi L, Zhang M, Zheng R, et al (2026)

Comparative genomics reveals two major lineages of Bifidobacterium adolescentis in the human gut, driven by divergent adaptation in China and the United States.

Journal of advanced research pii:S2090-1232(26)00098-6 [Epub ahead of print].

INTRODUCTION: Bifidobacterium adolescentis is a key beneficial member of the human gut microbiota, but its genomic diversity and evolutionary drivers across human populations remain poorly characterized.

OBJECTIVES: Understanding genomic functional heterogeneity and evolutionary patterns in human gut-derived B. adolescentis.

METHODS: We performed a comparative genomic analysis of 395 B. adolescentis, mainly from China (n = 169) and the U.S. (n = 146), with smaller sets from Australia, Italy, and the United Kingdom, to investigate functional heterogeneity and evolutionary mechanisms. Our analysis integrated core and pan-genome architecture, phylogenomics, single nucleotide polymorphism (SNP)-based population structure, carbohydrate-active enzyme profiles, CRISPR-Cas systems, antibiotic resistance genes, and recombination dynamics.

RESULTS: The pan-genome was open and highly plastic. Phylogenetic reconstruction identified two major clades with strong geographic stratification: Chinese isolates predominantly clustered in Clade B, while U.S. isolates grouped in Clade A. Functional annotation showed regional specialization in carbohydrate-active enzymes, with Chinese isolates enriched in glycosyltransferase families and U.S. isolates in carbohydrate-binding module and carboxylesterase families, likely reflecting dietary adaptations. Genomic islands were hotspots for horizontal gene transfer, harboring region-specific carbohydrate-active enzymes and antibiotic resistance genes such as tet(W/32/O) and ermX, which were frequently located in Chinese isolates. Recombination was found to be the primary driver of genetic diversity, with recombination-to-mutation ratios approaching and exceeding 3.0 in Chinese and U.S. isolates. Linkage disequilibrium decay further supported higher recombination rates in these populations.

CONCLUSION: B. adolescentis has diverged into two major genomic lineages, primarily associated with isolates from China and the U.S. This divergence reflects adaptation to distinct host-associated ecological factors, such as diet, antibiotic exposure, and lifestyle, and is predominantly driven by extensive homologous recombination rather than point mutations. These findings highlight how regional selective pressures shape the genomic and functional landscape of this key gut symbiont.

RevDate: 2026-01-28
CmpDate: 2026-01-28

Allemailem KS (2025)

Pangenome-Guided Reverse Vaccinology and Immunoinformatics Approach for Rational Design of a Multi-Epitope Subunit Vaccine Candidate Against the Multidrug-Resistant Pathogen Chromobacterium violaceum: A Computational Immunopharmacology Perspective.

Pharmaceuticals (Basel, Switzerland), 19(1): pii:ph19010029.

Background: Chromobacterium violaceum is an emerging multidrug-resistant (MDR) Gram-negative bacterium associated with severe septicemia, abscess formation, and high mortality, particularly in immunocompromised individuals. Increasing antimicrobial resistance and the absence of approved vaccines underscore the urgent need for alternative preventive strategies. Traditional vaccine approaches are often inadequate against genetically diverse MDR pathogens, prompting the use of computational immunology and reverse vaccinology for vaccine design. Objectives: This study aimed to design and characterize a novel multi-epitope subunit vaccine (MEV) candidate against C. violaceum using a comprehensive pangenome-guided subtractive proteomics and immunoinformatics pipeline to identify conserved antigenic targets capable of eliciting strong immune responses. Methods: Comparative genomic analysis across eight C. violaceum strains identified 3144 core genes. Subtractive proteomics filtering yielded two essential, non-homologous, surface-accessible, and antigenic proteins, penicillin-binding protein 1A (Pbp1A) and organic solvent tolerance protein (LptD), as vaccine targets. Cytotoxic T-lymphocyte (CTL), helper T-lymphocyte (HTL), and B-cell epitopes were predicted and integrated into a 272-amino-acid MEV construct adjuvanted with human β-defensin-4A using optimal linkers. The construct was evaluated through structural modeling, molecular docking with TLR4, molecular dynamics simulation, immune simulation, and in silico cloning into the pET-28a(+) vector. Results: The MEV construct exhibited strong antigenicity, non-allergenicity, and non-toxicity, with stable tertiary structure and favorable physicochemical properties. Docking and dynamics simulations demonstrated high binding affinity and stability with TLR4 (ΔG = -16.2 kcal/mol), while immune simulations predicted durable humoral and cellular immune responses with broad population coverage (≈89%). Codon optimization confirmed high expression potential in E. coli K12. Conclusions: The pangenome-guided immunoinformatics approach enabled the identification of conserved antigenic proteins and rational design of a promising multi-epitope vaccine candidate against MDR C. violaceum. The construct exhibits favorable immunogenic and structural features, supporting its potential for experimental validation and future development as a preventive immunotherapeutic against emerging MDR pathogens.

RevDate: 2026-01-28
CmpDate: 2026-01-28

Thant EP, Klaysubun C, Suwannasin S, et al (2026)

Global Comparative Genomics of Stenotrophomonas maltophilia Reveals Cryptic Species Diversity, Resistome Variation, and Population Structure.

Life (Basel, Switzerland), 16(1): pii:life16010158.

Background: Stenotrophomonas maltophilia is an increasingly important multidrug-resistant opportunistic pathogen frequently isolated from clinical, environmental, and plant-associated niches. Despite its medical relevance, the global population structure, species-complex boundaries, and genomic determinants of antimicrobial resistance (AMR) and ecological adaptation remain poorly resolved, partly due to inconsistent annotations and fragmented genomic datasets. Methods: Approximately 2400 genome assemblies annotated as Stenotrophomonas maltophilia were available in the NCBI Assembly database at the time of query. After pre-download filtering to exclude metagenome-assembled genomes and atypical lineages, 1750 isolate genomes were retrieved and subjected to stringent quality control (completeness ≥ 90%, contamination ≤ 5%, ≤500 contigs, N50 ≥ 10 kb, and ≤1% ambiguous bases), yielding a final curated dataset of 1518 high-quality genomes used for downstream analyses. Genomes were assessed using CheckM, annotated with Prokka, and compared using average nucleotide identity (ANI), pan-genome analysis, core-genome phylogenomics, and functional annotation. AMR genes, mobile genetic elements (MGEs), and metadata (source, host, and geographic origin) were integrated to assess lineage-specific genomic features and ecological distributions. Results: ANI-based clustering resolved the S. maltophilia complex into multiple distinct genomospecies and revealed extensive misidentification of publicly deposited genomes. The pan-genome was highly open, reflecting strong genomic plasticity driven by accessory gene acquisition. Core-genome phylogeny resolved well-supported clades associated with clinical, environmental, and plant-related niches. Resistome profiling showed widespread intrinsic MDR determinants, with certain lineages enriched for efflux pumps, β-lactamases, and trimethoprim-sulfamethoxazole resistance markers. MGE analysis identified lineage-specific integrative conjugative elements, prophages, and transposases that correlated with source and geographic distribution. Conclusions: This large-scale analysis provides the most comprehensive genomic overview of the S. maltophilia complex to date. Our findings clarify species boundaries, highlight substantial taxonomic misannotation in public databases, and reveal lineage-specific AMR and mobilome patterns linked to ecological and clinical origins. The curated dataset and evolutionary insights generated here establish a foundation for global genomic surveillance, epidemiological tracking, and future studies on the evolution of antimicrobial resistance in S. maltophilia.
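
Illustrative note (not from the paper): the quality-control thresholds quoted in the Methods above can be expressed as a simple filter. The sketch below is an assumption about how such a filter might look; the `Assembly` record, its field names and the example values are hypothetical, and in practice completeness and contamination would come from a tool such as CheckM rather than being typed in by hand.

```python
from dataclasses import dataclass
from typing import List


def n50(contig_lengths: List[int]) -> int:
    """Length of the contig at which the running sum of contigs,
    sorted from longest to shortest, first reaches half the assembly size."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly


@dataclass
class Assembly:  # hypothetical record type
    name: str
    completeness: float      # percent
    contamination: float     # percent
    contig_lengths: List[int]
    ambiguous_bases: int     # count of N (or other ambiguous) bases


def passes_qc(a: Assembly) -> bool:
    """Apply the thresholds quoted in the abstract."""
    total_bases = sum(a.contig_lengths)
    return (
        a.completeness >= 90.0
        and a.contamination <= 5.0
        and len(a.contig_lengths) <= 500
        and n50(a.contig_lengths) >= 10_000
        and a.ambiguous_bases <= 0.01 * total_bases
    )


# Hypothetical examples: one clean assembly, one highly fragmented one.
good = Assembly("good", 99.1, 0.4, [2_000_000, 1_500_000, 900_000], 120)
fragmented = Assembly("fragmented", 95.0, 1.0, [5_000] * 600, 0)
print(passes_qc(good), passes_qc(fragmented))
```

The fragmented example fails on both contig count and N50, which shows why the two criteria are usually applied together.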

RevDate: 2026-01-28
CmpDate: 2026-01-28

Wonglapsuwan M, Ninrat T, Chaichana N, et al (2025)

Global Genomic Landscapes of Lactiplantibacillus plantarum: Universal GABA Biosynthetic Capacity with Strain-Level Functional Diversity.

Life (Basel, Switzerland), 16(1): pii:life16010047.

Lactiplantibacillus plantarum is widely used in fermented foods and as a probiotic, yet the genomic basis underlying its γ-aminobutyric acid (GABA) production capacity and strain-level functional diversity remains incompletely resolved. We analyzed 1240 publicly available genomes to map species-wide genome architecture, the distribution of GABA-related genes, and accessory drivers of phenotypes. Pangenome analysis identified 45,201 gene families, including 622 strict core genes (1.38%) and 444 soft-core genes (2.36%). The accessory genome dominated (3138 shell and 40,997 cloud genes; 97.64%), indicating a strongly open pangenome. In contrast, the GABA (gad) operon was universally conserved: gadB (glutamate decarboxylase) and gadC (glutamate/GABA antiporter) were present in all genomes regardless of isolate source. Accessory-genome clustering revealed ecological and geographic structure without loss of the operon, suggesting that phenotypic variability relevant to fermentation and probiotic performance is primarily shaped by accessory modules. Accessory features included carbohydrate uptake and processing islands, bacteriocins and immunity systems, stress- and membrane-associated functions, and plasmid-encoded traits. Analysis of complete genomes confirmed substantial variation in plasmid load (median = 2; range = 0-17), highlighting the role of mobile elements in niche-specific adaptation. Carbohydrate-Active Enzymes database (CAZy) and biosynthetic gene cluster (BGC) profiling revealed a conserved enzymatic and metabolic backbone complemented by rare lineage-specific functions. Collectively, these results position L. plantarum as a genetically stable GABA producer with extensive accessory-encoded flexibility and provide a framework for rational strain selection.
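
Illustrative note (not from the paper): the percentages in the abstract above can be checked from the four reported counts. The short calculation below suggests that the 2.36% figure quoted next to the soft-core genes is the cumulative share of strict core plus soft-core families, since it is what makes the core and accessory shares sum to 100%. This reading is an inference from the arithmetic, not a statement by the authors.

```python
# Counts as reported in the abstract.
counts = {"strict_core": 622, "soft_core": 444, "shell": 3138, "cloud": 40997}
total = sum(counts.values())  # should equal the 45,201 gene families reported


def pct(n, total):
    """Share of the pangenome in percent, rounded to two decimals."""
    return round(100 * n / total, 2)


strict_core_pct = pct(counts["strict_core"], total)
core_total_pct = pct(counts["strict_core"] + counts["soft_core"], total)
accessory_pct = pct(counts["shell"] + counts["cloud"], total)

print(total, strict_core_pct, core_total_pct, accessory_pct)
```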

RevDate: 2026-01-28
CmpDate: 2026-01-28

Romanenko L, Eremeev V, Bystritskaya E, et al (2026)

Genomic Insights into Marinovum sedimenti sp. nov., Isolated from Okhotsk Sea Bottom Sediments, Suggest Plasmid-Mediated Strain-Specific Motility.

Microorganisms, 14(1): pii:microorganisms14010125.

Two Gram-negative aerobic halophilic bacteria, designated KMM 9989[T] and KMM 9879, were isolated from a bottom sediment sample of the Okhotsk Sea, Russia. The novel strains grew in 0.5-4% NaCl, at 5-35 °C and pH 5.5-10.0. Phylogenetic analyses based on 16S rRNA gene and whole genome sequences placed strains KMM 9989[T] and KMM 9879 within the family Roseobacteraceae, where they were clustered with their closest relative Marinovum algicola KCTC 22095[T]. The average nucleotide identity (ANI) between strain KMM 9989[T] and Marinovum algicola KCTC 22095[T] was 81.4%. The level of digital DNA-DNA hybridization (dDDH) between the novel isolates KMM 9989[T] and KMM 9879 was 97%, while between strain KMM 9989[T] and Marinovum algicola KCTC 22095[T], it was 27%. Strains KMM 9989[T] and KMM 9879 contained Q-10 as the predominant ubiquinone and C18:1ω7c as the major fatty acid. The polar lipids were phosphatidylcholine, phosphatidylglycerol, phosphatidylethanolamine, diphosphatidylglycerol, an unidentified aminolipid, two unidentified phospholipids, and three unidentified lipids. The genomic size of strains KMM 9989[T] and KMM 9879 was determined to be 4,040,543 bp and 3,969,839 bp with a DNA GC content of 61.3 and 61.4 mol%, respectively. Both strains contained a common plasmid of 238,277 bp and a strain-specific plasmid (188,734 bp for KMM 9989[T] and 118,029 bp for KMM 9879). It is suggested that the motility of KMM 9879 may be mediated by the presence of a complete fla2-type operon in the strain-specific chromid. Thus, based on the phylogenetic analyses and distinctive phenotypic characteristics, the novel marine strains KMM 9989[T] and KMM 9879 are proposed to be classified as a novel species Marinovum sedimenti sp. nov. with the strain KMM 9989[T] (=KCTC 8835[T]) as the type strain of the species.

RevDate: 2026-01-28
CmpDate: 2026-01-28

Tamayo-Ordóñez YJ, Rosas-García NM, Bello-López JM, et al (2026)

A Possible Recently Identified Evolutionary Strategy Using Membrane-Bound Vesicle Transfer of Genetic Material to Induce Bacterial Resistance, Virulence and Pathogenicity in Klebsiella oxytoca.

International journal of molecular sciences, 27(2): pii:ijms27020988.

Klebsiella oxytoca has emerged as an important opportunistic pathogen in nosocomial infections, particularly during the COVID-19 pandemic, due to its capacity to acquire and disseminate resistance and virulence genes through horizontal gene transfer (HGT). This study presents a genome-based comparative analysis of K. oxytoca within the genus Klebsiella, aimed at exploring the evolutionary plausibility of outer membrane vesicle (OMV) associated processes in bacterial adaptation. Using publicly available reference genomes, we analyzed pangenome structure, phylogenetic relationships, and the distribution of mobile genetic elements, resistance determinants, virulence factors, and genes related to OMV biogenesis. Our results reveal a conserved set of envelope-associated and stress-responsive genes involved in vesiculogenic pathways, together with an extensive mobilome and resistome characteristic of the genus. Although these genomic features are consistent with conditions that may favor OMV production, they do not constitute direct evidence of functional OMV-mediated horizontal gene transfer. Instead, our findings support a hypothesis-generating evolutionary framework in which OMVs may act as a complementary mechanism to established gene transfer routes, including conjugation, integrative mobile elements, and bacteriophages. Overall, this study provides a genomic framework for future experimental and metagenomic investigations into the role of OMV-associated processes in antimicrobial resistance dissemination and should be interpreted as a recently identified evolutionary strategy inferred from genomic data, rather than a novel or experimentally validated mechanism.

RevDate: 2026-01-28
CmpDate: 2026-01-28

Yin R, Liu H, Lin S, et al (2026)

Identification and Analysis of the Terpene Synthases (TPS) Gene Family in Camellia Based on Pan-Genome.

Genes, 17(1): pii:genes17010094.

Terpenes are major determinants of tea aroma, and terpene synthases (TPSs) catalyze key steps in terpenoid biosynthesis. To capture gene-family variation beyond a single reference, we performed a pan-genome-based analysis of TPS genes across nine Camellia genomes (three wild tea relatives and six cultivated Camellia sinensis accessions) and integrated pan-transcriptome profiling across eight tissues. We identified 381 TPS genes; wild species contained more TPSs than cultivated accessions (mean 58.3 vs. 34.3), suggesting a putative contraction. Phylogenetic analysis with Arabidopsis TPSs classified Camellia TPSs into five subfamilies, dominated by TPS-b (149) and TPS-a (140), whereas TPS-c was rare (8). Gene-structure and physicochemical analyses revealed marked subfamily divergence, with TPS-c showing highly conserved coding-region length. Orthology clustering assigned 355 TPSs to 19 orthogroups, including five core groups (190 genes, 53.5%) and 14 dispensable groups (165 genes, 46.5%); core/non-core status was significantly associated with subfamily composition. Tandem and proximal duplication contributed most to TPS expansion (29.4% and 29.1%), and all orthogroups exhibited copy-number variation, with pronounced lineage-specific expansions. Ka/Ks analyses indicated pervasive purifying selection (median 0.516) but heterogeneous constraints among subfamilies. Finally, cultivated tea showed higher TPS expression in most tissues, especially mature leaf and stem, and TPS-g displayed the broadest and strongest expression. Together, these results provide a pan-genome resource for Camellia TPSs and highlight how domestication, duplication, and CNV shape terpene-related genetic diversity.

RevDate: 2026-01-28
CmpDate: 2026-01-28

Wang X, Cheng S, B Du (2025)

Pan-Genomic Identification and Analysis of the Maize BBX Family.

Genes, 17(1): pii:genes17010046.

BBX transcription factors play crucial roles in plant growth, development, and stress resistance. Utilizing maize whole-genome data, we identified 35 members of the maize BBX gene family, comprising 18 core genes, 14 near-core genes, 4 non-essential genes, and 150 private genes. The phylogenetic tree constructed using Arabidopsis thaliana revealed that the fourth subfamily contained the largest number of core genes, totaling eight, and exhibited significant diversity throughout the evolutionary process of maize. The Ka/Ks ratios of the BBX family members in the 26 genomes indicated that, except for ZmBBX20 and ZmBBX42 under positive selection, the remaining genes were subjected to purifying selection. Further analysis combining transcriptome data and RT-qPCR demonstrated that maize BBX family member expression levels changed significantly in response to cold stress after cold treatment, highlighting their important roles in abiotic stress responses. In summary, in this study, we utilized the maize pan-genome and bioinformatics approaches to investigate maize BBX family member evolutionary relationships and functional roles, providing a new theoretical framework for further research on this gene family.

RevDate: 2026-01-28

Lin L, Zheng X, Tao Y, et al (2026)

Genome-resolved metagenomics uncovers diversity and functional landscapes of the gastrointestinal epithelium-associated microbiome in cattle.

Genome biology pii:10.1186/s13059-026-03960-z [Epub ahead of print].

BACKGROUND: The ruminant gastrointestinal epithelium harbors a diverse and functionally critical microbial community that remains poorly characterized due to persistent host-derived DNA contamination in metagenomic studies.

RESULTS: We develop Dilute-MetaSeq (dilution-based metagenomic sequencing), a novel metagenomic workflow integrating gradient dilution with multiple displacement amplification. Dilute-MetaSeq reduces host DNA interference by 52.4-fold and achieves > 90% microbial sequencing efficiency when assessing the gastrointestinal epithelium-associated microbiome. This enables the construction of the microbial genome atlas of gastrointestinal epithelium (MGA-GE). This comprehensive resource, comprising 1,907 nonredundant prokaryotic and 5,603 viral genomes, reveals extraordinary microbial diversity and novelty, with 41.4% of prokaryotic and 99.9% of viral genomes representing taxonomically unclassified lineages. Spatial profiling identifies the rumen and reticulum as a biodiversity hotspot dominated by epithelium-adapted Butyrivibrio and methylotrophic Methanomassiliicoccales, while functional annotation uncovers 1,200 biosynthetic gene clusters (primarily RiPPs and NRPSs) and 1,212 viral auxiliary metabolic genes linked to host metabolism modulation. Pangenome analysis of 987 strains, including a novel Butyrivibrio clade with reduced genome sizes, elevated GC content, and butyrate synthesis from amino acid-derived substrates (e.g., glutarate, lysine), highlights metabolic adaptations to the nutrient-scarce epithelial niche compared to digesta-associated microbes.

CONCLUSIONS: Collectively, the MGA-GE provides transformative insights into host-microbe-virus interactions and establishes a foundation for developing microbiome-based intervention strategies to enhance ruminant health, agricultural productivity, and bioactive discovery.

RevDate: 2026-01-27
CmpDate: 2026-01-27

Virieux-Petit M, Aujoulat F, Dupont C, et al (2026)

Adaptive evolution of Pseudomonas aeruginosa ST299 population colonizing a hospital copper water network over a 2.5-year period.

Microbial genomics, 12(1):.

Background. Descriptions of genomic characters and dynamics related to Pseudomonas aeruginosa (PA) adaptation and survival in hospital water networks remain scarce but necessary for sustainable water management in hospitals. Methods. A new copper water network in an intensive care unit (ICU) was chronically colonized by a genotype sequence type (ST) 299 of PA and sporadically by a genotype ST2685. Sixty-eight ST299-PA and four ST2685-PA strains from ICU-water samples collected over 29 months and 4 months, respectively, were studied for genomic adaptation to the copper water network. PFGE and whole-genome sequencing allowing SNP, comparative genomics, resistome, virulome and pangenome analyses were performed. In order to understand the adaptive phenomena linked to the colonization niche, 16 isolates of ST299-PA colonizing a cystic fibrosis (CF) patient during 16 months were included. Results and discussion. The 68 ST299-PA ICU-water strains differed by <0.15 SNPs on average. Neither recombination regions nor patho-adaptive mutations were identified. Resistome, virulome and pangenome were stable over time. The genomic content included a copper resistance operon, mainly metal resistance genes, a Tn4661-like transposon, the class 1 integron and broad-spectrum efflux pumps. These elements, absent in ST2685-PA ICU-water, support the survival of ST299-PA in the copper water network. The evolutionary speed of ST299-PA CF was faster with 12.9 SNPs among strains mostly affecting genes of patho-adaptation, arguing that the primo-colonization strain was probably not adapted to the niche, in contrast to the high genomic stability observed for the ST299-PA ICU-water population signifying the primary adaptation to the water network.

RevDate: 2026-01-27

Miraeiz E, Borges Dos Santos L, ME Hudson (2026)

Modern Genomics Reshapes Soybean Cyst Nematode Research: Integrating Host Resistance, Nematode Virulence, and Functional Discovery.

Molecular plant-microbe interactions : MPMI [Epub ahead of print].

The soybean cyst nematode (SCN), Heterodera glycines, remains the most damaging pathogen of soybean worldwide. Genomic advances over the past two decades have transformed our understanding of both the nematode and its host. On the soybean side, genome sequencing, pangenome development, and multi-omics studies have clarified how classical resistance loci such as Rhg1 and Rhg4 function while also revealing a broader and more complex landscape of resistance mechanisms. On the nematode side, effector discovery, high-quality genome assemblies, and evolutionary analyses have shed light on how SCN adapts to resistant cultivars and remodels host cellular processes. Despite these advances, overreliance on a single resistance source, PI 88788, continues to accelerate virulence shifts in SCN populations. This underscores the urgent need for diversified resistance, improved monitoring of nematode adaptation, and deeper mechanistic insight into the interaction. In this review, we integrate current knowledge of soybean-SCN interactions across genomics, transcriptomics, proteomics, cell biology, and microbiome research. We highlight how integrative functional genomics is reshaping the discovery of resistance genes, clarifying nematode virulence strategies, and guiding the development of more durable management approaches. Finally, we outline emerging directions, including pangenomics, dual host-pathogen analyses, and predictive breeding, that are expected to advance innovation in SCN control. Copyright © 2026 The Author(s). This is an open access article distributed under the CC BY-NC-ND 4.0 International license.

RevDate: 2026-01-27

Conrad RE, Tsementzi D, Meziti A, et al (2026)

Metagenome-based vertical profiling of the Gulf of Mexico highlights its uniqueness and far-reaching effects of freshwater input.

Applied and environmental microbiology [Epub ahead of print].

Genomic and metagenomic explorations of the oceans have identified well-structured microbial assemblages showing endemic genomic adaptations with increasing depth. However, deep water column surveys have been limited, especially of the Gulf of Mexico (GoM) basin, despite its importance for human activities. To fill this gap, we report on 19 deeply sequenced (~5 Gbp/sample) shotgun metagenomes collected along a vertical gradient, from the surface to about 2,000 m deep, at three GoM stations. Beta diversity analysis revealed strong clustering by depth, and not by station. However, a community-level pangenome style gene content analysis revealed ~54% of predicted gene sequences to be station-specific within our GoM samples. Of the 154 medium-to-high-quality MAGs recovered, 145 represent novel species compared with the NCBI genomes and Tara Oceans MAGs databases. Two of these MAGs were relatively abundant at both surface and deep samples, revealing remarkable versatility across the water column. A few MAGs of freshwater origin (~6% of total detected) were relatively abundant at 600 m deep and 270 miles from the coast at one station, revealing that the effects of freshwater input in the GoM can sometimes be far-reaching and long-lasting. Notably, 1,447/16,068 of the total COGs detected were positively (Pearson's r ≥ 0.5) or negatively (Pearson's r ≤ -0.5) correlated with depth, including beta-lactamases, dehydrogenases, and CoA-associated oxidoreductases. Taken together, our results reveal substantial novel genome and gene diversity across the GoM's water column, and testable hypotheses for some of the diversity patterns observed.

IMPORTANCE: To what extent microbial communities are similar between different ocean basins at similar depths, and what the impact of freshwater input by major rivers may be on these communities, remain poorly understood issues with potentially important implications for modeling and managing marine biodiversity.
In this study, we performed metagenomic sequencing and recovered 154 medium-to-high-quality metagenome-assembled genomes (MAGs) from three stations in the Gulf of Mexico (GoM) and from various depths up to about 2,000 m. Comparison to MAGs recovered from other ocean basins highlighted the unique diversity harbored by the GoM, which could be driven by more substantial input from the Mississippi River and by human activities, including offshore oil drilling. The data and results provided by this study should be useful for future comparative analysis of marine biodiversity and contribute to its more complete characterization.

RevDate: 2026-01-26

Zhou W, Chen D, Dong X, et al (2026)

Emergence and genomic adaptation of the globally disseminated ST2250 lineage within the Staphylococcus aureus complex.

Antimicrobial agents and chemotherapy [Epub ahead of print].

Staphylococcus argenteus, a member of the S. aureus complex, is increasingly recognized as a globally distributed pathogen with significant clinical relevance. Among its lineages, sequence type (ST) 2250 has emerged as the most prevalent and geographically widespread, yet its evolutionary history and genomic adaptations remain incompletely understood. In this study, we conducted a comprehensive genomic analysis of 277 ST2250 genomes from 26 countries between 2008 and 2025, integrating 14 newly sequenced isolates from China. Phylogenetic reconstruction resolved a basal clade I around 1989 and sister clades II and III that diversified later, in approximately 1996 and 1997, with frequent cross-regional, intercontinental, and cross-host transmission events. A methicillin-resistant S. argenteus subclade within clade II likely arose from a single SCCmec IVc acquisition, accompanied by a blaZ-carrying plasmid. Clade III genomes carried a related multidrug-resistant (MDR) plasmid encoding blaZ, tet(L), and aph(3')-III; Bayesian phylogenetic inference indicated that this plasmid was introduced into the ancestor of the clade III MDR subclade around 2001, potentially promoting its subsequent expansion. Both clades also exhibited enriched virulence profiles, particularly the secretion system gene esaG7. Despite the widespread presence of active defense systems that might limit the acquisition of mobile genetic elements, the ST2250 pan-genome remains open, with evidence of active gene flux and convergent selection targeting resistance, virulence, and metabolic pathways. These findings elucidate the global spread, ecological plasticity, and adaptive evolution of ST2250, providing critical genomic insights into the emergence and persistence of this lineage.

RevDate: 2026-01-28
CmpDate: 2026-01-26

Fang M, Wang X, Yu X, et al (2025)

Clonal replacement by a P1-1/ST3 lineage in pediatric Mycoplasma pneumoniae, Jinan, China, 2021-2024.

Frontiers in cellular and infection microbiology, 15:1732239.

INTRODUCTION: After a prolonged lull during COVID-19 non-pharmaceutical interventions, Mycoplasma pneumoniae activity re-emerged in 2023 in multiple regions; in China this occurred against a backdrop of very high macrolide resistance. We conducted a retrospective single-center study of pediatric M. pneumoniae pneumonia in Jinan, comparing a pre-resurgence period (2021) with 2023-2024.

METHODS: Clinical data were linked to whole-genome sequencing of 227 cultured isolates. We assessed lineage composition and relatedness using core-genome phylogenetics and SNP-threshold networks, and compared diversity and pan-genome functional profiles across major clades. Phenotypic antimicrobial susceptibility testing was performed.

RESULTS: The proportion of severe cases increased from 7.4% (2021) to 19.9% (2024). Over the same interval, the P1-1/ST3 lineage rose from 41.9% to 84.0%, displacing previously co-circulating lineages. Core-genome analyses indicated reduced diversity and a compact ST3 cluster within the T1-3R subclade of the P1-type 1 lineage (EC1 clone), alongside a smaller P1-type 2/T2-2 (EC2/ST14) clade. Using a ≤11-SNP threshold, 74% of isolates fell within the largest connected component. Pan-genome comparisons suggested enrichment of replication/recombination/repair functions in T1-3R, whereas canonical adhesion factors and the CARDS toxin were conserved. All isolates carried the 23S rRNA A2063G substitution with phenotypic macrolide resistance, while in vitro susceptibility to tetracycline and levofloxacin was retained.

DISCUSSION: The 2023-2024 resurgence coincided with clonal replacement by P1-1/ST3 in a setting of fixed macrolide resistance and an increase in severe pediatric disease. Given the retrospective, culture-based design, this should be interpreted as a temporal association rather than evidence that ST3 intrinsically caused more severe disease. These findings support consideration of non-macrolide agents in similar high-resistance settings and motivate prospective genomic-clinical surveillance.
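
To make the SNP-threshold network idea in this abstract concrete, the sketch below links any two isolates whose pairwise SNP distance is at or below a threshold and reports the fraction of isolates in the largest connected component. It is a minimal illustration in Python, not the authors' pipeline: the function name, the dict-of-dicts distance format, and the four-isolate toy matrix are all hypothetical.

```python
from collections import deque


def largest_component_fraction(snp_dist, threshold=11):
    """Fraction of isolates in the largest cluster when isolates that
    differ by <= threshold SNPs are joined by an edge."""
    names = list(snp_dist)
    seen, best = set(), 0
    for start in names:
        if start in seen:
            continue
        # Breadth-first search over the implicit threshold graph.
        queue, size = deque([start]), 0
        seen.add(start)
        while queue:
            node = queue.popleft()
            size += 1
            for other in names:
                if other not in seen and snp_dist[node][other] <= threshold:
                    seen.add(other)
                    queue.append(other)
        best = max(best, size)
    return best / len(names)


# Hypothetical pairwise SNP distances for four isolates (symmetric).
toy = {
    "A": {"B": 3, "C": 13, "D": 40},
    "B": {"A": 3, "C": 10, "D": 45},
    "C": {"A": 13, "B": 10, "D": 50},
    "D": {"A": 40, "B": 45, "C": 50},
}

print(largest_component_fraction(toy))  # A, B and C chain together; D stays apart
```

Note that A and C end up in the same component even though they differ by 13 SNPs, because both are within the threshold of B; this single-linkage behavior is inherent to connected-component clustering.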

RevDate: 2026-01-28
CmpDate: 2026-01-26

van Workum DM, Dey KK, Kozik A, et al (2026)

MoGAAAP: a modular Snakemake workflow for automated genome assembly and annotation with quality assessment.

NAR genomics and bioinformatics, 8(1):lqag008.

With the current speed of sequencing, there is a desire for standardized and automated genome assembly and annotation to produce high-quality genomes as input for comparative (pan)genomics. Therefore, we created a convenience pipeline using existing tools that creates annotated genome assemblies from HiFi (and optionally ultra-long ONT and/or Hi-C) reads for a set of related individuals as well as a related reference genome. Our pipeline is species-agnostic and generates an extensive quality assessment report that can be used for manual filtering and refinement of the assembly and annotation. It includes statistics for individual completeness and contamination assessments as well as a concise pangenome view. The pipeline is implemented in Snakemake and available with a GPLv3 licence at GitHub under github.com/dirkjanvw/MoGAAAP, at Zenodo under doi.org/10.5281/zenodo.14833021, and can be installed through Bioconda.

RevDate: 2026-01-25

Duarte-Zambrano L, Nava-Domínguez N, Mireles-Dávalos CD, et al (2026)

Virulence and genomic features of hypervirulent Klebsiella pneumoniae Species Complex.

Microbial pathogenesis pii:S0882-4010(26)00031-8 [Epub ahead of print].

Hypervirulent Klebsiella pneumoniae is a pathotype capable of causing invasive infections with high morbidity and mortality rates. In this study, we conducted a surveillance analysis of hypervirulent isolates circulating in Mexico to characterize their phenotypic and genomic features. Presumptive hypervirulent Klebsiella pneumoniae Species Complex (pHvKpSC) isolates were identified at a frequency of 6.48% (19/293), comprising 17 K. pneumoniae sensu stricto and two K. quasipneumoniae subsp. similipneumoniae. Isolates were predominantly recovered from male patients (12/19, 63%). Clinical samples were primarily obtained from the lower respiratory tract (15/19, 78.9%), followed by blood (3/19, 15.7%), and pleural fluid (1/19, 5.2%). Genetic and phenotypic analyses revealed substantial heterogeneity, including significant phenotype-genotype discordance. Notably, this cohort includes the first identified convergent hypervirulent K. pneumoniae strain in Mexico, as well as two hypervirulent K. quasipneumoniae isolates, a phenomenon significantly less frequent in the latter species than in K. pneumoniae. This observed heterogeneity prompted the proposal of a local classification scheme based on virulence-associated genes, murine lethality and antimicrobial susceptibility. Phylogenetic and pangenome analyses revealed clustering patterns associated with sequence and capsule types. These data contribute to a deeper understanding of the Hypervirulent K. pneumoniae Species Complex and provide valuable insights into the diversity of high-threat strains currently circulating in Mexico.

RevDate: 2026-01-25

Rodrigo CH, Kulappu Arachchige SN, Zare S, et al (2026)

Comparative genomics of Ornithobacterium spp. isolated from free range layer chickens with respiratory infections unveils marked genetic diversity and putative new species.

Veterinary microbiology, 314:110897 pii:S0378-1135(26)00028-3 [Epub ahead of print].

The bacterium Ornithobacterium rhinotracheale causes upper respiratory tract infections (URTI) in commercial poultry worldwide. Efficient diagnosis and control of this emerging pathogen require accurate understanding of its classification, prevalence and distribution. The present study explores the genetic diversity of sixty-seven organisms presumptively identified as Ornithobacterium and recovered from chickens with URTIs in Australian free-range layer farms. Rep-PCR fingerprinting revealed wide diversity of isolates between and within farms and sites of infection. Forty representative isolates were sequenced entirely and compared to published genomes. Sequence alignments of the rpoB gene supported their classification into the genus Ornithobacterium, and 16S rRNA analysis revealed 98.08 % to 100 % identity with O. rhinotracheale type-strain DMS15997. However, most isolates gave non-interpretable profiles with the current Multi Locus Sequence Typing (MLST) scheme. Average Nucleotide Identity (ANI) analysis separated the dataset into four genetically divergent clusters. Most of the published O. rhinotracheale genomes, including DMS15997, belonged to the largest group, whereas the other clusters contained isolates with ANI values ranging from 84 % to 92 % against DMS15997, suggesting the presence of new species or sub-species. Pan-genome analysis was consistent with these observations, identifying only a small set of core genes (n = 254) in the dataset, while delineating distinct subsets of accessory proteins for each ANI cluster. Core single nucleotide polymorphism phylogeny further confirmed the substantial genetic diversity of the isolates. This study underlines the complex epidemiology and taxonomy of Ornithobacterium-associated URTIs in poultry farms, and is expected to improve diagnostic and control programs for this pathogen.

RevDate: 2026-01-23

Tagg KA, Peñil-Celis A, Webb HE, et al (2026)

Pangenome dynamics and population structure of the zoonotic pathogen Salmonella enterica serotype Hadar.

Nature communications pii:10.1038/s41467-025-68026-3 [Epub ahead of print].

The bacterial accessory genome, comprised of plasmids, phages, and other mobile elements, underpins the adaptability of bacterial populations. Pangenome (core and accessory) analysis of pathogens can reveal epidemiological relatedness missed by using core-genome methods alone. Employing a k-mer-based Jaccard Index approach to compute pangenome relatedness, we explore the population structure and epidemiology of Salmonella enterica serotype Hadar (Hadar), an emerging zoonotic pathogen in the United States (U.S.) linked to both commercial and backyard poultry. A total of 3384 U.S. Hadar genomes collected between 1990 and 2023 are analyzed here. Hadar populations underwent substantial shifts between 2019 and 2020 in the U.S., driven by the expansion of a lineage carrying a previously uncommon prophage-like element. Phylogenetic and pangenomic relatedness, coupled with epidemiological data, suggest this lineage emerged from extant populations circulating in commercial poultry, with subsequent dissemination into backyard poultry environments. We demonstrate the utility of pangenomic approaches for mapping vertical and horizontal diversity and informing complex dynamics of zoonotic bacterial pathogens.
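
The k-mer-based Jaccard Index mentioned in this abstract can be illustrated with a few lines of Python: each sequence is reduced to its set of k-mers, and relatedness is the size of the intersection divided by the size of the union. This is a bare-bones sketch under simplifying assumptions; the function names, the choice of k = 4, and the toy sequences are illustrative, and the abstract does not say which tool, k value, or refinements (such as canonical k-mers or sketching) the authors used.

```python
def kmers(seq, k):
    """Set of all overlapping substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(seq_a, seq_b, k=4):
    """Jaccard index of the two sequences' k-mer sets: |A ∩ B| / |A ∪ B|."""
    a, b = kmers(seq_a, k), kmers(seq_b, k)
    union = a | b
    # Two sequences too short to yield any k-mer are treated as identical.
    return len(a & b) / len(union) if union else 1.0


# Toy sequences: they share the 4-mers CGTA and GTAC out of four distinct 4-mers.
print(jaccard("ACGTAC", "CGTACG"))
```

Because the comparison uses all k-mers rather than only aligned core genes, gain or loss of accessory elements such as a prophage lowers the index even when the core genome is unchanged.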

RevDate: 2026-01-23

Torres-Higuera LD, Rojas-Tapias DF, Jiménez-Velásquez S, et al (2026)

Comprehensive genotyping and taxonomic analysis uncovers extensive distribution of intermediate Leptospira species in Colombia.

World journal of microbiology & biotechnology, 42(2):57.

Leptospirosis, a globally prevalent zoonosis caused by pathogenic and intermediate Leptospira species, poses significant threats to public health and livestock industries. Despite its substantial impact, knowledge gaps persist regarding the prevalence and genetic diversity of Leptospira strains in many regions, including South America. This study aimed to characterize a diverse collection of Leptospira strains isolated from various sources in Colombia to enhance our understanding of the genetic diversity within this genus. Using a tiered approach combining conventional and genomic methods, we genotyped 55 isolates from various sources using 16S rRNA and rpoB gene sequencing, DNA ribotyping, and Multiple-Locus Variable-Number Tandem Repeat Analysis (MLVA). Most isolates were classified into phylogenetic groups containing pathogenic and intermediate strains of L. interrogans and L. wolffii, respectively, which was corroborated by ribotyping and MLVA. Whole-genome sequencing of selected strains revealed distinct genomic characteristics compared to related strains. Pan-genome analysis identified strain-specific genes, primarily hypothetical, while virulence factor analysis distinguished species-specific patterns. Furthermore, CRISPR-Cas system analysis uncovered genetic variations among the isolates. This study provides a framework for understanding Leptospira genetic diversity in Colombia and its potential implications on human and animal health. Our findings highlight the need for improved diagnostic methods and surveillance strategies that encompass both pathogenic and intermediate Leptospira species, which could significantly impact public health policies and veterinary practices in the region.

RevDate: 2026-01-23

Sui Y, Lin J, Noyes MD, et al (2026)

Using the linear references from the pangenome to discover missing autism variants.

Nature communications pii:10.1038/s41467-026-68378-4 [Epub ahead of print].

To better understand large-effect pathogenic variation associated with autism, we generated long-read sequencing (LRS) data to construct phased and near-complete genome assemblies (average contig N50 = 43 Mbp, QV = 56) for 189 individuals from 51 families with unsolved cases. We applied read- and assembly-based strategies to facilitate comprehensive characterization of de novo mutations, structural variants (SVs), and DNA methylation. Using LRS pangenome controls, we efficiently filtered >97% of common SVs exclusive to 87 offspring. We find no evidence of increased autosomal SV burden for probands when compared to unaffected siblings yet observe a suggestive trend toward an increased SV burden on the X chromosome among affected females. We establish a workflow to prioritize potential pathogenic variants by integrating autism risk genes and putative noncoding regulatory elements defined from ATAC-seq and CUT&Tag data from the developing cortex. In total, we identified three pathogenic variants in TBL1XR1, MECP2, and SYNGAP1, as well as nine candidate de novo and biallelic inherited homozygous SVs, most of which were missed by short-read sequencing. Our work highlights the potential of phased genomes to discover more complex pathogenic mutations and the power of the pangenome to restrict the focus on an increasingly smaller number of SVs for clinical evaluation.

RevDate: 2026-01-23
CmpDate: 2026-01-23

Gao S, Oshima KK, Chuang SC, et al (2025)

A global view of human centromere variation and evolution.

bioRxiv : the preprint server for biology pii:2025.12.09.693231.

Centromeres are essential for accurate chromosome segregation during cell division, yet their highly repetitive sequence has historically hindered their complete assembly and characterization. Consequently, the full spectrum of centromere diversity across individuals, populations, and evolutionary contexts remains largely unexplored. Here, we address this gap in knowledge by assembling and characterizing 2,110 complete human centromeres from a diverse cohort of individuals representing 5 continental and 28 population groups. By developing a novel suite of bioinformatic tools tailored for centromeric regions, we uncover previously unknown variation within centromeres, including 226 novel centromere haplotypes and 1,870 new α-satellite higher-order repeat (HOR) variants. We find that mobile element insertions are present in 30% of centromeres, with chromosome 16 harboring Alu elements within the kinetochore site at an 11-fold higher frequency than expected. While most centromeres have a single kinetochore site, 6% of them have di-kinetochores, and <<1% have tri-kinetochores, which we confirm with long-read CENP-A CUT&RUN, DiMeLo-seq, and multi-generational inheritance. We further show that the position of the kinetochore is not random and is, instead, closely associated with the underlying sequence and structure of the centromere. To understand the nature of evolutionary change, we compared 2,110 complete human centromeres to 5,747 complete centromeres recently assembled from the Human Pangenome Reference Consortium. We show that centromeres have a >50-fold variation in mutation rate, with the most rapidly mutating centromeres on chromosome 1 and the slowest mutating centromeres on chromosome Y. Additionally, a subset of centromeres show evidence of introgression from archaic hominins, shaping their sequence, structure, and evolutionary history. 
We validate these centromere mutation rates in a four-generation family, spanning 28 family members and 483 accurately assembled centromeres, and show that the kinetochore site is the most rapidly mutating region in the centromere, with twofold more single-nucleotide variants than the rest of the centromeric α-satellite HOR array on average. We propose a model that reveals an 'arms race' between centromeric sequence and proteins, with frequent mutations within the site of the kinetochore that lead to changes in genetic and epigenetic landscapes and, ultimately, rapid evolution of these critically important regions.

RevDate: 2026-01-22

Fei X, Moussa J, Guerra PR, et al (2026)

Comparative pan-genomics and in vivo validation identify genetic factors important for virulence of Salmonella enterica serovar Gallinarum and serovar Enteritidis in the avian host.

Microbiological research, 306:128453 pii:S0944-5013(26)00017-0 [Epub ahead of print].

Salmonella enterica subspecies enterica serovar Gallinarum biovar Gallinarum (SGa) and Pullorum (SPu) are avian-specific pathogens causing systemic disease, while S. Enteritidis (SEnt) is a broad host range serovar causing gastroenteritis. The genomic mechanisms underlying this difference in host range and pathogenicity remain incompletely understood. Here, we performed a large-scale pan-genome analysis of 5440 poultry-derived genomes (4927 SEnt, 106 SGa, 407 SPu) integrated with functional chicken and macrophage experiments. Compared with SEnt, avian-specific SGa and SPu exhibited extensive pseudogenization and shared 87 genes absent in SEnt, organized into four major genomic clusters (PG_1-PG_4) enriched in type VI secretion system genes and prophage-derived elements. Conserved SNPs distinguishing SGa/SPu from SEnt were enriched in carbohydrate and nitrogen metabolism pathways, suggesting potential metabolic divergences during infection. Infection experiments in chickens using deletion mutants revealed that deletions of genes in SPI-2 (ssaE, ssaT) and fimbrial genes (stfA, safA) were important for systemic infection of chicken with both SGa and SEnt, despite pseudogenization of fimbrial operons in SGa. Mutants in SPI-13 and SPI-14 were only significantly attenuated in SGa. The specific prophage region PG_3 was important for systemic infection in SGa, while a distinct prophage element (ENT_2) enhanced infection in SEnt. Together, these findings bridge comparative genomics with experimental validation, identifying genomic degradation, prophage acquisition, and serovar-specific pathogenicity islands as putative mechanisms underlying avian host specificity and systemic pathogenesis in Salmonella.

RevDate: 2026-01-21
CmpDate: 2026-01-21

Rodriguez S, Rey-Varela D, Martinez C, et al (2026)

Genomic plasticity and mobilome architecture of Vibrio europaeus reveal key mechanisms of evolutionary adaptation.

Microbial genomics, 12(1):.

Vibrio europaeus has emerged as a significant pathogen in shellfish aquaculture, causing mass mortality outbreaks in key bivalve species and leading to severe economic losses for the industry. Studies on the structure and characteristics of the accessory genome in aquaculture pathogens remain scarce, despite its crucial role in evolutionary and ecological adaptation. Indeed, the accessory genome provides genetic variability that enables rapid responses to environmental challenges, host adaptation and selective pressures such as antibiotics or phage predation. Here, we present the first comprehensive comparative genomic analysis of the V. europaeus pangenome to investigate the structural organization and functional content of its accessory genome. The soft mobilome of V. europaeus comprises 73% of accessory genes and 44% of the total pangenome, including non-chromosomic (plasmids) and chromosomic genetic elements such as prophages, integrative and conjugative/mobilizable elements, phage satellites and other mobile genetic elements (MGEs) designated as unclassified chromosomic regions of genomic plasticity (unclassified chromosomic RGPs). Among accessory elements, unclassified chromosomic RGPs were the primary drivers of evolutionary dynamics in V. europaeus, acting as the main genetic reservoir of anti-phage defence systems and antimicrobial resistance genes. Notably, the identification of abundant insertion hotspots in chromosomic genetic elements facilitates the rapid acquisition of anti-phage defence systems, thereby enabling rapid turnover of these systems and enhancing host fitness. In addition, novel pVE1-like plasmids (>300 kb) - only found in this species and its closest relative Vibrio tubiashii - emerged as the largest and most ubiquitous MGEs in V. europaeus. These plasmids encode the highest number of virulence genes and secondary metabolite biosynthetic genes, as well as a remarkable diversity of anti-phage defence systems among closely related strains.
Although the genome dataset analysed here is limited to strains isolated from moribund/dead animals in aquaculture environments, this study provides new insights into the role of accessory genetic elements in the evolution, adaptation and diversification of the shellfish pathogen V. europaeus. The findings reveal the complexity and plasticity of its pangenome and highlight the importance of RGPs and plasmids in bacterial fitness.

RevDate: 2026-01-21
CmpDate: 2026-01-21

More R, Yadav V, N Vadakedath (2026)

Calyptranema fuscum gen. sp. nov.: a novel cyanobacterial genus within Oculatellaceae based on polyphasic and genomic characterization.

Current research in microbial sciences, 10:100542.

This study presents a comprehensive characterization and classification of a novel cyanobacterial isolate, strain S582, proposed as Calyptranema fuscum gen. sp. nov. within the family Oculatellaceae using an integrated polyphasic approach. Strain S582 was isolated from a lake in the Botanical Garden, Sarangpur, Chandigarh, India. Initial molecular characterization with the 16S rRNA gene revealed ≤94.90% similarity with related genera and showed unique 16S-23S ITS secondary structures, indicating its delineation as a novel genus. Morphological assessment highlighted the presence of a cap-like structure called calyptra at the terminal cells, further distinguishing it from related genera. Furthermore, whole genome sequencing yielded an assembly of 7,962,515 bp with a GC content of 48.27%. Genome-based analysis encompassing average nucleotide identity (ANI), average amino acid identity (AAI), and percentage of conserved proteins (POCP) was subsequently performed. The observed values for ANI (71.15% to 73.00%) and AAI (63.30% to 69.62%) were below the established genus-level thresholds. Phylogenetic analysis using maximum-likelihood and Bayesian inference along with phylogenomic reconstruction based on 1434 single copy core genes supported its taxonomic novelty. Functional classification revealed the presence of biosynthetic gene clusters (BGCs), tRNAs, insertion elements, CRISPR/Cas systems, and genes associated with metabolism, carbon fixation and photosynthesis. Additionally, the pangenome was constructed to study the genomic diversity of the studied isolate and related genera among the Oculatellaceae family and identified core, accessory, and singleton gene clusters. Collectively, these findings establish Calyptranema fuscum gen. sp. nov. as a novel genus within Oculatellaceae while expanding our understanding of cyanobacterial diversity and genomic potential.

RevDate: 2026-01-20

Skiadas P, Mendel MN, Elberse J, et al (2026)

Pangenome graph analysis reveals evolution of resistance breaking in spinach downy mildew.

PLoS biology, 24(1):e3003596 pii:PBIOLOGY-D-25-01888 [Epub ahead of print].

Filamentous plant pathogens secrete effectors to successfully establish host infections. In resistant crop varieties, plant immunity can be triggered by immune receptors that recognize these effectors. Resistant crop varieties are grown in large-scale monocultures imposing strong selection pressure on pathogens, driving rapid evolution of effector repertoires resulting in the frequent breakdowns of resistance within just a few growing seasons. The oomycete Peronospora effusa, responsible for downy mildew on spinach, is an example of a rapidly adapting pathogen, but it is not yet known how P. effusa can successfully overcome resistance of spinach by genomic adaptations. To close this knowledge gap, we here generated genome assemblies and constructed a pangenome graph for 19 isolates corresponding to 19 officially denominated resistance-breaking P. effusa races, which can cause disease on a differential set of spinach cultivars. Haplotype-resolved pangenome graph analyses revealed that many isolates emerged from recent sexual recombination, yet others evolved via prolonged asexual reproduction and loss of heterozygosity. By phasing effector candidates to determine their allelic variation, we identified effector candidates associated with resistance breaking of spinach varieties and reconstructed the evolutionary events that led to their diversification. The computational genomics approaches developed and applied here offer invaluable insights into the molecular mechanisms of the rapid evolution of P. effusa and point to potential targets for future resistance breeding.

RevDate: 2026-01-20

Harrand AS, Skeens J, Carroll L, et al (2026)

Listeria sanitizer tolerance at use-level concentrations shows limited association with genetic loci.

Applied and environmental microbiology [Epub ahead of print].

The ability of Listeria to show reduced susceptibility to sanitizers commonly used in fresh produce packing and processing environments continues to be mentioned as a concern. We assessed the survival of 501 produce-associated Listeria isolates (328 Listeria monocytogenes [LM] and 173 Listeria spp. [LS]) after 30 s of exposure to benzalkonium chloride (BC, 300 ppm) and peroxyacetic acid (PAA, 80 ppm). A subset of 108 isolates was also exposed to sodium hypochlorite (NaOCl, 500 ppm) for 30 s. Isolates showed a range of log reductions, including 2.76-5.73 log for BC, 0.15-6.16 log for PAA, and 1.34-7.02 log for NaOCl; the variance of log reductions was significantly lower for BC compared to PAA and NaOCl. Cluster analysis on log reduction data identified four clusters, including one cluster of five LM isolates that showed reduced susceptibility to all three sanitizers. Log reductions of LS were significantly lower than LM after exposure to PAA, indicating reduced PAA susceptibility among LS. Whole genome sequence (WGS)-based characterization of all isolates revealed that the presence of known BC resistance genes (i.e., bcrABC, mdrL, and sugE1/2) was not significantly associated with log reductions to BC, and the presence of stress survival islet SSI-2 was not significantly associated with log reductions to PAA and NaOCl. Genome-wide association studies did not reveal any association of pangenome genes with phenotypic sanitizer susceptibility but identified several SNPs in core genes as associated with sanitizer susceptibility.

IMPORTANCE: Despite frequently stated concerns about LM and LS with reduced susceptibility to sanitizers (which could facilitate persistence and increase risk of product contamination), there are limited data available on Listeria susceptibility to sanitizers used in produce packing and processing environments at their recommended use-level concentrations. Importantly, our data showed that reduced sanitizer susceptibility of Listeria is not linked to the presence of any previously reported sanitizer resistance genes. However, we identified a group of five LM isolates that showed reduced susceptibility to all three sanitizers tested; these isolates represented lineages I, II, and III. Combined, these data suggest that there are no distinct "sanitizer-resistant" clonal Listeria groups and that WGS data may not be particularly valuable for predicting sanitizer susceptibility at use-level concentrations. Moreover, the high variability of log reductions observed across all three sanitizers highlights the importance of considering log reduction variability, in addition to average log reduction, when assessing different sanitizers.
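
Illustrative note (not part of the paper): the abstract above reports sanitizer efficacy as log reductions and stresses their variability as well as their average. As a minimal sketch, assuming the conventional definition of a log reduction as log10 of the ratio of viable counts before and after exposure, both summaries could be computed as below. The function names and the counts are hypothetical and are not data from the study.

```python
import math
import statistics


def log_reduction(cfu_before, cfu_after):
    """log10 reduction in viable counts (e.g. CFU/mL) after exposure."""
    if cfu_before <= 0 or cfu_after <= 0:
        raise ValueError("counts must be positive")
    return math.log10(cfu_before / cfu_after)


def summarize(log_reductions):
    """Return (mean, sample variance) of a list of log reductions."""
    return statistics.mean(log_reductions), statistics.variance(log_reductions)


# Hypothetical (before, after) counts for four isolates and one sanitizer.
toy_counts = [(1e7, 1e3), (1e7, 1e4), (1e7, 1e2), (1e7, 1e5)]
reductions = [log_reduction(before, after) for before, after in toy_counts]
mean_lr, var_lr = summarize(reductions)
print(reductions, mean_lr, var_lr)
```

Two sanitizers with the same mean log reduction can differ widely in variance, which is the point the authors make about reporting variability alongside the average.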

RevDate: 2026-01-20
CmpDate: 2026-01-20

Chen J, Ling D, Wang F, et al (2026)

Septic Shock Caused by Coinfection of Shewanella algae Bloodstream Infection and Epstein-Barr Virus: Clinical Characteristics and Genomic Analysis.

MicrobiologyOpen, 15(1):e70221.

Shewanella algae, a marine-origin opportunistic pathogen, has shown a significant increase in non-coastal infections, yet its environmental adaptability and synergistic pathogenic mechanisms with Epstein-Barr virus (EBV) coinfection remain unclear. This study reports a clinical case of S. algae bloodstream infection complicated by EBV reactivation leading to septic shock in Sichuan Province, China, and elucidates the molecular mechanisms through genomic analysis. Pathogen identification was performed via blood culture, antibiotic susceptibility testing, and genomic annotation. The strain harbored resistance genes (acrB, tolC, tet(35), golS) and virulence factors (bplL/bplF, clpC/clpP, tonB). Phylogenetic analysis indicated the highest genetic affinity to freshwater-derived Shewanella chilikensis, while pan-genome analysis identified 1412 unique genes, including transmembrane transporters and carbohydrate-active enzyme genes, suggesting freshwater adaptive evolution. Metagenomic next-generation sequencing (mNGS) detected a high EBV load. The patient succumbed to multi-organ failure. This study reveals that S. algae may evolve freshwater adaptability to cause inland infections, and EBV coinfection accelerates septic shock through immunosuppression and inflammatory cascades. Genomic analysis provides critical insights for precision diagnosis and treatment of polymicrobial infections.

RevDate: 2026-01-19

Edwards D (2026)

On the use and misuse of pangenome and related terms.

Nature communications pii:10.1038/s41467-026-68624-9 [Epub ahead of print].

RevDate: 2026-01-19
CmpDate: 2026-01-19

van Hal SJ, Jenkins F, Hogan TR, et al (2026)

Gene exchange between Neisseria meningitidis and Neisseria gonorrhoeae.

Microbial genomics, 12(1):.

Genetic exchange between Neisseria meningitidis (NM) and Neisseria gonorrhoeae (NG) has not been well studied. This study aimed to investigate evidence of genetic exchanges between these two species. All coincident paired NM and NG isolates cultured from pharyngeal swabs collected from a sexual health clinic in Sydney in 2021 underwent whole-genome sequencing. A gene-by-gene analysis of the 47 NM-NG pairs identified 184 instances where the ancestry of the gene revealed intermixing between the two species. Incorporating the gene phylogenies demonstrated that these events occurred across a wide range of timeframes. At the nucleotide level, 91 genes were found where paired isolates harboured identical sequences. Notably, one instance of an unequivocal recent gene transfer event between the paired pharynx isolates was observed. This work provides new insights into the evolutionary dynamics of these bacteria and highlights the importance of genetic exchange in populations with high rates of pharyngeal gonorrhoea. The clinical implications of such exchanges call for continued vigilance and research to address the challenges posed by these bacteria.

RevDate: 2026-01-19

Gladman N, Olson A, Kumari S, et al (2026)

SorghumBase: a knowledgebase for sorghum genomics, phenomics, and stakeholder engagement.

Genetics pii:8429854 [Epub ahead of print].

Centralizing valuable community data and resources into a user-friendly interface and accessible repository has become essential for agricultural science; embracing Findable, Accessible, Interoperable, and Reusable (FAIR) principles is now standard for effective databases. SorghumBase (https://www.sorghumbase.org) is a knowledgebase designed for the sorghum research community. The SorghumBase team curates genomic, transcriptomic, variation, and phenotypic information and aggregates community events, providing rich visualizations and bulk data access. The modular framework of the database is built with open-access software to yield a robust, modifiable, and sustainable data infrastructure. Release 9 of SorghumBase includes: (i) 88 sorghum reference genomes and an updated pan-gene index, (ii) over 100 million variants mapped onto the two genomes, BTx623 and Tx2783, (iii) assignment of 41 million Reference Cluster SNP identifiers (rsIDs) from BTx623 across the pan-genome, (iv) updated gene search homology, gene expression, and germplasm visualizations and features, (v) added and standardized 234 phenotypic data from 40 community-generated GWAS studies and 148 traits from the Sorghum QTL Atlas (Oz Sorghum), (vi) improved news, funding, and a research content management system for community access and interaction, (vii) outreach materials including training documents and videos, and (viii) community engagement initiatives through training and working groups. SorghumBase serves as a hub for sorghum data and stakeholder engagement while promoting community standards to drive research and multi-omics breeding approaches.

RevDate: 2026-01-16

V R, Kukreti A, Prasannakumar MK, et al (2026)

Pan-genome and antibiotic resistance insights into Xanthomonas citri pv. punicae pathotypes.

BMC microbiology pii:10.1186/s12866-025-04625-w [Epub ahead of print].

RevDate: 2026-01-16
CmpDate: 2026-01-16

Jakubickova M, Sabatova K, Zbudilova M, et al (2025)

MethylomeMiner: A novel tool for high-resolution analysis of bacterial methylation patterns from nanopore sequencing.

Computational and structural biotechnology journal, 27:4753-4759.

DNA methylation plays a key role in gene regulation, genome stability, bacterial adaptation, and many other essential cellular processes. Thanks to nanopore sequencing technology, it is now possible to detect these modifications during sequencing without any prior chemical treatment. However, methylation data processing and their interpretation in a biological context remain challenging as there are no convenient and easy-to-use tools available for this purpose. Therefore, here, we present a simple Python-based tool, MethylomeMiner, to process methylation calls from nanopore sequencing. The tool allows high-confidence methylation sites to be selected based on coverage and methylation rate and assigned to coding or non-coding regions using genome annotation. In addition, the tool supports population-level analysis using pangenome data to compare methylation patterns across multiple bacterial genomes. Altogether, MethylomeMiner provides a straightforward and reproducible workflow that can be easily integrated into existing analyses and helps uncover the functional roles of DNA methylation in bacterial genomes.

RevDate: 2026-01-16
CmpDate: 2026-01-16

Han S, Park JY, Han YH, et al (2026)

Genome-wide regulon of NtrC reveals genetic regulation under nitrogen limitation in Methylomonas sp. DH-1.

iScience, 29(1):114322.

Nitrogen is an essential element, but its scarcity often leads to growth constraints, driving the development of metabolic pathways and regulatory mechanisms. Here, we employed an integrated multi-omics approach to analyze the transcription factor NtrC and its regulon, elucidating the transcriptional adaptive response of Methylomonas sp. DH-1 to nitrogen limitation. An integrative analysis of ChIP-exo and RNA-seq revealed that 19 genes are directly regulated by NtrC. The NtrC regulon includes genes for glutamine synthesis, nitrite reduction, and formate/nitrite transport, suggesting a role in nitrogen assimilation. In-depth analysis revealed that various nitrogen metabolic pathways are regulated to coordinate with NtrC's role by increasing flux to ammonium. Additionally, pan-genome analysis confirmed that glutamine synthesis and nitrite metabolism are conserved as primary functions of NtrC within the genus Methylomonas. This study provides deeper insights into the transcriptional regulation strategies of methanotrophs under nitrogen-limited conditions.

RevDate: 2026-01-16
CmpDate: 2026-01-16

Fukasawa Y (2026)

Benchmarking long-read variant calling in diploid and polyploid genomes: insights from human and plants.

BMC genomics, 27(1):46.

Accurate characterization of genetic variation is fundamental to genomics. While long-read sequencing technologies promise to resolve complex genomic regions and improve variant detection, their application in complex genomes has not been well validated. Here, we systematically investigate the factors influencing variant calling accuracy using accurate long reads. Using human trio data with known variants to simulate variable ploidy levels (diploid, tetraploid, hexaploid), we demonstrate that while variant sites can often be identified accurately, genotyping accuracy decreases with increasing ploidy due to allelic dosage uncertainty. This highlights a specific challenge in assigning correct allele counts in polyploids even with high depth, separate from the initial variant discovery. We then assessed genotyping and variant detection performance in real genomes with varying complexity: the relatively simple diploid Fragaria vesca, the tetraploid Solanum tuberosum, and the highly repetitive diploid Zea mays. Our results reveal that overall variant calling accuracy is influenced strongly by inherent genome complexity (e.g., repeat content). Furthermore, we identify a critical mechanism impacting variant discovery: structural variations between the reference and sample genomes, particularly those containing repetitive elements, can induce spurious read mapping. This effect is likely exacerbated by the length and accuracy of long reads. This leads to false variant calls, constituting a distinct and more dominant source of error than allelic-dosage uncertainty. Our findings underscore the multifaceted challenges in long-read variant analysis and highlight the need for ploidy-aware genotypers and bias-aware mapping strategies to fully realize the potential of long reads in diverse organisms.
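
Illustrative note (not part of the paper): the abstract above attributes falling genotyping accuracy at higher ploidy to allelic dosage uncertainty. The sketch below is a deliberately simplified model, assuming error-free reads, unbiased mapping, binomial sampling of alternate-allele reads, and a naive caller that rounds the observed alternate-read fraction to the nearest dosage. None of these assumptions or names come from the paper, whose benchmarks use real variant callers.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def dosage_call_accuracy(ploidy, true_dosage, depth):
    """Probability that a nearest-fraction caller recovers the true dosage.

    true_dosage is the number of alternate-allele copies (0..ploidy);
    depth is the number of reads covering the site.
    """
    p_alt = true_dosage / ploidy
    accuracy = 0.0
    for alt_reads in range(depth + 1):
        # Naive call: scale the observed alt fraction to a copy number.
        called = round(alt_reads * ploidy / depth)
        if called == true_dosage:
            accuracy += binom_pmf(alt_reads, depth, p_alt)
    return accuracy


# Same depth and same 50% alt-allele fraction, different ploidy.
for ploidy, dosage in [(2, 1), (4, 2), (6, 3)]:
    acc = dosage_call_accuracy(ploidy, dosage, depth=30)
    print(f"ploidy={ploidy} dosage={dosage} accuracy={acc:.3f}")
```

Under this toy model the window of read counts that maps to the correct dosage narrows as ploidy grows, so accuracy at a fixed depth drops even though the variant site itself is easy to detect.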

RevDate: 2026-01-15

Belay KH, Abdelrazek S, Kaur S, et al (2026)

Genomic insights into Ceratobasidium sp. associated with vascular streak dieback of woody ornamentals in the United States using a metagenomic sequencing approach.

Microbiology spectrum [Epub ahead of print].

Woody ornamentals are integral to urban landscapes and play important roles in habitat restoration and ecological conservation, yet their national and international trade facilitates the spread of plant diseases with significant ecological and economic consequences. Vascular streak dieback (VSD) recently emerged on woody ornamentals in the United States and was found to be associated with the fungal pathogen Ceratobasidium sp. (Csp), but little is known about its genomic diversity and associated microbial communities. We thus applied metagenomic sequencing to 106 symptomatic samples that had tested positive for Csp and had been collected from 34 woody ornamental species in seven states. Taxonomic profiling identified Csp as the only putative pathogen of which we recovered 17 high-quality draft genomes. Phylogenomic and pangenome analyses revealed that U.S. Csp isolates form a tight genetic cluster, distinct in gene content from C. theobromae, a pathogen of cacao, avocado, and cassava in Southeast Asia. Comparative analyses highlighted gene content differences, including candidate effectors and secondary metabolite clusters, which may underlie host interactions and offer diagnostic targets. These findings provide the first genomic insights into the U.S. Csp population, suggest the recent introduction of a single genetic lineage with a broad host range, and establish a framework for improved detection, monitoring, and management of VSD in woody ornamentals.

IMPORTANCE: Identification of the pathogen that causes an emerging disease, be it of humans, animals, or plants, is a prerequisite to develop effective treatment and/or management practices and to try to control the disease outbreak to prevent further pathogen spread. Vascular streak dieback (VSD) is an emerging disease of ornamental bushes and trees in the United States. Identification of the pathogen has been hindered by the difficulty in growing the fungal pathogen found to be associated with diseased plants in pure culture. Here, we succeeded in sequencing the DNA of the likely pathogen directly from plant tissue or from the fungal mass growing out of collected plant tissue. The sequences were assembled into genomes, which allowed us to precisely identify the pathogen, compare it to related pathogens of other plants, and predict how it causes disease. These results can now be used to inform management and control of VSD.

RevDate: 2026-01-15

Santos LBD, Showmaker KC, Masonbrink RE, et al (2026)

Pangenome analysis of nine soybean cyst nematode genomes reveals hidden variation contributing to diversity and adaptation.

BMC genomics pii:10.1186/s12864-025-12493-x [Epub ahead of print].

BACKGROUND: The soybean cyst nematode (SCN) is a persistent threat to soybean production. SCN populations continually overcome resistant cultivars, causing significant yield losses. Studies conducted with a single reference genome restrict our understanding of intraspecific diversity, masking significant mechanisms of virulence evolution and host adaptation. Here we report a pangenome constructed of nine SCN populations of different pathotypes, including eight newly generated high-fidelity genome assemblies.

RESULTS: We detected over 19,000 orthologous gene families and more than 12,000 putative secreted proteins in SCN. Combined, these data indicate substantial diversity across populations. Gene content analysis showed that 35% of gene families were the conserved core, 15% were soft-core, and 48% were accessory. Evidence of rapid evolution was identified in a high portion (40%) of core single-copy genes, most notably inside the protein domains responsible for host recognition and immune modulation. Analysis of gene-family expansion revealed extensive duplication and loss across lineages, suggesting ongoing paralog turnover within SCN populations. Finally, a graph-based pangenome enabled the identification of numerous structural variants within regions under selection.

CONCLUSIONS: Our study highlights substantial genetic variation in SCN that is not captured by single-reference analyses. By integrating multiple high-quality assemblies, we show that the SCN genome is highly dynamic, with extensive gene duplication and loss as well as structural variation shaping the differences among nematode populations. Collectively, the SCN pangenome provides a robust resource for studying virulence and adaptation mechanisms in SCN and establishes a genomic foundation for the development of more precise management strategies.
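
Illustrative note (not part of the paper): the RESULTS above split gene families into core, soft-core, and accessory fractions. As a minimal sketch, assuming a presence/absence table of families across genomes, such a partition could be computed as below. The 0.85 soft-core cutoff, the function name, and the toy families are assumptions made for illustration; the abstract does not state the thresholds the authors used.

```python
from collections import Counter


def classify_families(presence, n_genomes, soft_core_min=0.85):
    """Label each gene family as core, soft-core, or accessory.

    presence maps a family ID to the set of genome IDs that contain it.
    soft_core_min is an assumed cutoff on the fraction of genomes.
    """
    labels = {}
    for family, genomes in presence.items():
        if len(genomes) == n_genomes:
            labels[family] = "core"  # present in every genome
        elif len(genomes) / n_genomes >= soft_core_min:
            labels[family] = "soft-core"  # missing from only a few genomes
        else:
            labels[family] = "accessory"
    return labels


# Toy presence/absence data for nine hypothetical genomes (IDs 0..8).
toy_presence = {
    "famA": set(range(9)),  # 9 of 9 genomes
    "famB": set(range(8)),  # 8 of 9 genomes
    "famC": {0, 1},         # 2 of 9 genomes
}
labels = classify_families(toy_presence, n_genomes=9)
print(labels)
print(Counter(labels.values()))
```

With only nine genomes the choice of soft-core cutoff matters a great deal, since a single missing genome already lowers a family's frequency to about 89%.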

RevDate: 2026-01-14

Zhang W, Liu Y, Li G, et al (2026)

Strain-level metagenomic profiling using pangenome graphs with PanTax.

Genome research pii:gr.280858.125 [Epub ahead of print].

Microbes are omnipresent, thriving in a range of habitats, from oceans to soils, and even within our gastrointestinal tracts. They play a vital role in maintaining ecological equilibrium and promoting the health of their hosts. Consequently, understanding the diversity in terms of strains in microbial communities is crucial, as variations between strains can lead to different phenotypic expressions or diverse biological functions. However, current methods for taxonomic classification from metagenomic sequencing data have several limitations, including their reliance solely on species resolution, support for either short or long reads, or their confinement to a given single species. Most notably, most existing strain-level taxonomic classifiers rely on the sequence representation of multiple linear reference genomes, which fails to capture the sequence correlations among these genomes, potentially introducing ambiguity and biases in metagenomic profiling. Here, we present PanTax, a pangenome graph-based taxonomic profiler that overcomes the shortcomings of sequence-based approaches, because pangenome graphs possess the capability to depict the full range of genetic variability present across multiple evolutionarily or environmentally related genomes. PanTax provides a comprehensive solution to taxonomic classification for strain resolution, compatibility with both short and long reads, and compatibility with single or multiple species. Extensive benchmarking results demonstrate that PanTax drastically outperforms state-of-the-art approaches, primarily evidenced by its significantly higher F1 score at the strain level, while maintaining comparable or better performance in other aspects across various data sets.

RevDate: 2026-01-14
CmpDate: 2026-01-14

Pang J, Wei Z, Zhang Z, et al (2026)

Genomic Landscape Reveals Correlation of Endosymbiont Ralstonia With Acanthamoeba Keratitis Severity.

Investigative ophthalmology & visual science, 67(1):17.

PURPOSE: To identify the basic genomic profile of Acanthamoeba, obtain information on Acanthamoeba endosymbionts, and analyze the correlation between these endosymbionts and the prognosis of Acanthamoeba keratitis (AK) patients.

METHODS: Whole-genome sequencing was conducted on 30 cornea-derived Acanthamoeba strains. Pan-genome analysis was performed, and endosymbionts were identified by metagenomic analysis. Gimenez staining, fluorescence in situ hybridization, and transmission electron microscopy were used to prove the existence of endosymbionts. Linear discriminant analysis effect size was used to associate endosymbiont species with AK clinical prognosis. The correlation between the endosymbiont Ralstonia and pathogenicity was experimentally validated by assessing the biological characteristics of Acanthamoeba and by performing clinical and histopathological evaluations in AK mouse models.

RESULTS: Whole-genome sequencing revealed that the Acanthamoeba genome size was 37.1-105.0 Mb and GC content was 53.9%-60.5%. Pan-genomic analysis indicated an open state of the Acanthamoeba genome. Metagenomic analysis identified the presence of endosymbionts within Acanthamoeba, notably the endosymbiont Ralstonia, which was associated with poor prognosis at the genus level (P = 0.047). Acanthamoeba harboring the endosymbiont Ralstonia exhibited an increased migration area, enhanced adhesion, and a more pronounced cytopathic effect. Clinical scores and corneal ulcer size increased significantly in mouse models induced by Acanthamoeba harboring the endosymbiont Ralstonia.

CONCLUSIONS: Whole-genome sequencing highlighted the symbiotic relationship between Acanthamoeba and associated microorganisms. The presence of the endosymbiont Ralstonia influenced the biological characteristics of Acanthamoeba and was correlated with poor clinical prognosis in AK, suggesting its potential as a target for clinical intervention.

RevDate: 2026-01-14

Ricci ML, Fillo S, Giordani F, et al (2026)

Genomic characterization of Legionella pneumophila serogroup 1 ST901 isolates responsible for recurrent travel-associated Legionnaires' disease cases and clusters.

Pathogens and global health [Epub ahead of print].

Cases of travel-associated Legionnaires' disease (TALD) are frequently reported in Italy. From 1987 to 2021, 61 cases of TALD were linked to 22 hotels in a municipality in northern Italy. Legionella pneumophila serogroup 1 (Lp1) strains isolated from both patients and hotel water systems were identified as sequence type (ST) 901, a genotype rarely associated with travel-related infections in Italy or elsewhere. Whole-genome sequencing was used to analyze 41 isolates, and phylogenetic relationships were inferred by core genome multilocus sequence typing (cgMLST), single nucleotide polymorphisms (SNP) and pangenome analyses. The Lp ST901 isolates were found to form a clade characterized by some accessory genomic islands (AGI) already described in other epidemic strains, such as Alcoy, Corby, Paris and Philadelphia; other islands, containing either transposase/recombinase or transcriptional regulator factors or Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-cas systems, were specific to Lp ST901. Lp ST901 also appears to have undergone possible recombination with other strains, such as Lp ST47 (Lorraine strain). Additionally, CRISPR-Cas systems may have contributed to the protection of Lp ST901 from external dangers, while the colonized hotel water systems may have provided an ideal environmental protective niche. Our findings highlight that Lp ST901 has public health significance and deserves attention in Legionnaires' disease surveillance and risk assessment.

RevDate: 2026-01-14
CmpDate: 2026-01-14

Lee HG, Song JY, Yoon J, et al (2026)

metaFun: An analysis pipeline for metagenomic big data with fast and unified functional searches.

Gut microbes, 18(1):2611544.

Metagenomic approaches offer unprecedented opportunities to characterize microbial community structure and function, yet several challenges remain unresolved. Inconsistent genome quality impairs reliability of metagenome-assembled genomes, lack of unified taxonomic criteria limits cross-study comparability, and multi-step workflows involving numerous programs and parameters hinder reproducibility and accessibility. We benchmarked existing programs and parameters using simulated metagenomic data to identify optimal configurations. metaFun is an open-source, end-to-end pipeline that integrates quality control, taxonomic profiling, functional profiling, de novo assembly, binning, genome assessment, comparative genomic analysis, pangenome annotation, network analysis, and strain-level microdiversity analysis into a unified framework. Interactive modules support standardized data interpretation and exploratory visualization. The pipeline is implemented with Nextflow and containerized with Apptainer, ensuring environment reproducibility and scalability. Comprehensive documentation is available at https://metafun-doc.readthedocs.io/en/main. The pipeline was validated using a colorectal cancer cohort dataset. By addressing key methodological gaps, metaFun facilitates accessible and reproducible metagenomic analysis for the broader research community.

RevDate: 2026-01-13

Ajumobi V, Tahir Z, Hayes P, et al (2026)

Population structure, antimicrobial resistance, and virulence factors of diabetic foot-associated Escherichia coli.

Microbiology spectrum [Epub ahead of print].

Diabetic foot infections (DFIs) are a major complication of diabetes, often leading to lower limb amputations. Escherichia coli is a predominant Gram-negative pathogen in DFI, yet its genomic and pathogenic features remain poorly characterized. Here, we present a whole-genome sequence-based analysis of diabetic foot-associated E. coli (DFEC) isolates from diverse geographical locations. Phylogenetic reconstruction revealed substantial diversity, with strains spanning 7 phylogroups and 28 sequence types. Capsule biosynthesis loci linked to invasive infections, such as K1, K2ab, and K5, were also detected. The DFEC pangenome comprised 18,263 gene clusters, indicating high genomic plasticity. The plasmid repertoire was also varied and contributed to the genomic diversity of the strains. Approximately 78% of isolates were multidrug-resistant or extensively drug-resistant, with resistance to last-resort antibiotics such as colistin and carbapenems also observed. High frequencies of virulence factors involved in host cell adherence, iron metabolism, serum survival, as well as toxins and type 3 secretion system genes were also detected. In contrast, metabolic modeling showed conserved biochemical profiles. Clustering based on accessory metabolic functions did not mirror phylogeny, suggesting metabolic convergence among distinct lineages. Collectively, these findings reveal that DFEC are versatile pathogens with a repertoire of antimicrobial resistance and virulence determinants. These traits make them functionally distinct from commensal E. coli strains and highlight the potential of DFEC to cause severe and invasive infections.

IMPORTANCE: This study presents the first multisite genomic characterization of diabetic foot-associated Escherichia coli (DFEC). Our findings reveal that DFEC strains are phylogenetically diverse and span multiple lineages. The high prevalence of multidrug-resistant and extensively drug-resistant genotypes underscores the underestimated antimicrobial resistance (AMR) threat posed by DFEC. We detect high frequencies of virulence factors commonly associated with extraintestinal pathogenic E. coli, which indicates that DFEC might have the potential to cause severe complications, such as sepsis. The large accessory genome and evidence of metabolic convergence across distinct lineages highlight the adaptive versatility of DFEC in the polymicrobial and inflammatory environment of chronic wounds. These insights advance our understanding of DFEC pathobiology and support the development of targeted diagnostics, AMR surveillance, and therapeutic strategies to improve clinical outcomes for diabetic patients.

RevDate: 2026-01-12

Walia S, Motwani H, Tseng YH, et al (2026)

Compressive pangenomics using mutation-annotated networks.

Nature genetics [Epub ahead of print].

Pangenomics is an emerging field that uses collections of genomes, rather than a single reference, to reduce bias and capture intra-species diversity. However, existing pangenomic data formats face challenges in scaling to millions of genomes and primarily emphasize variation, often neglecting the underlying mutational events and evolutionary relationships. This work introduces Pangenome Mutation-Annotated Network (PanMAN), a lossless pangenome representation that achieves compression ratios ranging from 3.5-1,391× in file sizes compared to existing variation-preserving formats, with performance generally improving on larger datasets. In addition to compression, PanMAN increases representational capacity by encoding detailed mutational and evolutionary histories inferred across genomes, thereby enabling new biological insights. Using PanMAN, a comprehensive SARS-CoV-2 pangenome was constructed from 8 million publicly available sequences, requiring only 366 MB of disk space. We also present 'panmanUtils', a toolkit that supports common analyses and ensures interoperability with existing software. PanMAN is poised to greatly improve the scale, speed, resolution and scope of pangenomic analysis and data sharing.

RevDate: 2026-01-12
CmpDate: 2026-01-12

Wang F, Liu Q, Ghonimy A, et al (2026)

Genomic architecture and transcriptional regulation of cellulose degradation in the novel marine bacterium Pseudoxanthomonas sp. JC1303.

Molecular genetics and genomics : MGG, 301(1):19.

Microbial degradation of cellulose is a fundamental process driving the global carbon cycle and holds immense potential for sustainable biotechnology; however, the genomic mechanisms and transcriptional regulation underlying this capability in marine environments remain largely underexplored. To decipher these complex biological strategies, we isolated the novel strain JC1303 from marine sediments and integrated whole-genome sequencing with transcriptomic analysis to systematically characterize its enzymatic arsenal and metabolic adaptations. Whole-genome sequencing revealed that strain JC1303 possesses a circular chromosome of 4.37 Mb in length, with a GC content of 67.41%. Phylogenetic analyses based on the 16S rRNA gene and whole-genome data suggest that strain JC1303 likely represents a new species within the genus Pseudoxanthomonas. Pan-genome analysis of the genus demonstrates a typical "open" genome architecture with only 3% conserved core genes, highlighting high evolutionary plasticity. In contrast, strain JC1303 has 936 unique genes significantly enriched in metabolism (163 genes) and signal transduction (138 genes), providing a molecular basis for its adaptation to the cellulose degradation niche. Genome mining identified a complete cellulolytic system comprising three endo-β-1,4-glucanases, two cellulases, and four β-1,4-glucosidases, supported by glycolysis/gluconeogenesis, TCA cycle, pentose phosphate pathway, amino acid synthesis pathways, ABC transport systems, and the respiratory chain. Crucially, comparative transcriptomic profiling under cellulose induction validated the functional execution of this genetic potential. Among 1465 differentially expressed genes, the strain exhibited a coordinated strategy: while distinct isozymes were downregulated, a key endoglucanase gene (JC1303_01352) and multiple membrane transporter genes were significantly upregulated. This suggests a specific mechanism coupling extracellular hydrolysis with efficient substrate uptake. In conclusion, this study not only elucidates the genetic blueprint and transcriptional regulation of a new marine cellulolytic species, Pseudoxanthomonas sp. JC1303, but also offers theoretical support for engineering robust biocatalysts.

RevDate: 2026-01-10
CmpDate: 2026-01-10

Davina-Nunez C, Rincon-Quintero A, Potel C, et al (2026)

Pangenomic analysis reveals metabolic adaptation of Haemophilus parainfluenzae to the urogenital tract.

Virulence, 17(1):2613506.

Haemophilus parainfluenzae (Hpar) is a common colonizer found in the upper respiratory tract, although recently urogenital colonization has emerged as a clinical concern. Urogenital Hpar has been associated with increased antibiotic resistance and virulence compared to respiratory Hpar. We analyzed the genomes of 270 Hpar isolates, including all sequencing data found in the NCBI sequence read archive database. The pangenomes of respiratory and urogenital isolates were compared in order to find potential metabolic or pathogenic adaptations to different host environments. The pangenome-wide association study found significant genomic differences. Specifically, the two-component signal transduction system was significantly enriched in urogenital samples, which could explain the adaptations of Hpar to the unique physico-chemical conditions of the urethra. Additionally, the two-component system could work as a new target for antimicrobials against pathogenic Hpar. The polysaccharide capsule, the main virulence factor in Haemophilus spp., was present in 26/65 of the urogenital samples from our facility, an increase from previous studies. In summary, the data presented suggest that respiratory and urogenital isolates of Hpar belong to different genetic lineages, and therefore it is possible that unprotected oral sex is not the route of transmission of Hpar from the respiratory tract to the urethra. Given the limited amount of available sequences, future studies collecting more isolates from different spatiotemporal locations would shed more light on this issue.

RevDate: 2026-01-10
CmpDate: 2026-01-10

Xu L, Miao T, Cui Z, et al (2026)

Synteny-based comparative pan-genome reveals a male-specific FT gene underlying flowering time dimorphism in kiwifruit.

The Plant journal : for cell and molecular biology, 125(1):e70664.

Actinidia spp. (kiwifruit) are functionally dioecious with separate male and female individuals that exhibit a subtle difference in flowering time. However, this sexually dimorphic trait, along with its evolutionary history and the role of sexually antagonistic selection, remains to be fully understood. To investigate the underlying causes of this dimorphism at the genus level, we conducted a comparative pan-genome analysis of two representative kiwifruit species, Actinidia chinensis and Actinidia eriantha, using 10 chromosome-scale genome assemblies from six distinct male and female genotypes. The construction of the pan-genome revealed a total of 52 774 non-redundant pan-gene orthogroups comprising 42 370 gene clusters and 10 404 unassigned genes. Building on this, the comparative analysis further identified 657 syntenic gene sets belonging to 595 pan-gene orthogroups that exhibit significant inter-sexual divergence. It is plausible that they are drivers of the antagonistic selection responsible for conserved sexually dimorphic traits across the extant kiwifruit species. One of these genes is Y-linked FT (YFT), a male-specific variant of FLOWERING LOCUS T that originated following the recent whole-genome duplication event. We demonstrate that A. eriantha YFT is able to promote flowering, suggesting that it may contribute to the sexual dimorphism in kiwifruit flowering time. Notably, YFT is located in the sex-determining region (SDR), with linkage to the sex-determining genes, facilitating its spread in natural populations and SDR's expansion on the Y chromosome. For ease of use and analysis, the entire comparative pan-genome workflow was integrated into a custom Perl script, SynPanScan. This approach helps decipher the genetic basis of flowering sexual dimorphism in kiwifruit and establishes a synteny-based comparative pan-genomic framework for investigating the heritable architecture of natural phenotypic variation.

RevDate: 2026-01-10
CmpDate: 2026-01-10

Evseev PV, Podoprigora IV, Chaplin AV, et al (2025)

Bulleidia extructa PP_925: Genome Reduction, Minimalist Metabolism, and Evolutionary Insights into Firmicutes Diversification.

International journal of molecular sciences, 27(1): pii:ijms27010448.

Bulleidia extructa strain PP_925, isolated from the periodontal pocket of a patient with periodontitis, is a Gram-positive Bacillota with an unusually compact genome of 1.38 Mb. Phylogenomic analyses place PP_925 within Erysipelotrichales and show close relatedness of Bulleidia to Solobacterium and Lactimicrobium, as well as the existence of previously undescribed related clades. The metabolic repertoire of PP_925 is strongly reduced: it retains glycolysis, the phosphotransacetylase-acetate kinase pathway, and arginine catabolism but lacks the tricarboxylic acid cycle and most de novo biosynthetic pathways for amino acids, nucleotides, fatty acids, cofactors, and vitamins, implying reliance on salvage and cross-feeding. Phylogenetic inference indicates independent peptidoglycan losses in multiple mycoplasma Erysipelotrichia-related lineages, while PP_925 has retained an ancestral Gram-positive cell wall despite extensive genomic reduction. The genome preserves systems crucial for host interaction and adaptability, including a horizontally acquired tad locus encoding type IV pili, a comG competence system, and several adherence-associated virulence factors. Defense mechanisms are diverse and include a CRISPR-Cas II-A system, a type II restriction-modification module adjacent to Gao_Qat-like genes, and the Wadjet system in a genome without prophages; CRISPR spacers indicate repeated encounters with Bacillota phages. Comparative genomics of PP_925 and related strains reveals a small core genome with lineage-specific adhesion and defense modules, indicating recent shared ancestry combined with adaptive flexibility under substantial genome reduction.

RevDate: 2026-01-10
CmpDate: 2026-01-10

Zhang S, Lan R, Zhao R, et al (2025)

Genomic and Metabolomic Insights Into the Probiotic Potential of Weissella viridescens.

Biology, 15(1): pii:biology15010063.

Weissella viridescens has been proposed as a probiotic candidate, but strain-level multi-omics evidence remains limited. The complete genome of the human-derived W. viridescens strain Wv2365 was sequenced through a hybrid assembly of Illumina and PacBio sequencing reads and compared with eight publicly available W. viridescens genomes. Pangenome analysis and functional annotation were performed, and metabolites were profiled by broadly targeted metabolomic analysis. In addition, the acid and bile tolerance, auto-aggregation and cell surface hydrophobicity, and antioxidant activity of the strain, as well as both in silico and phenotypic safety, were assessed. Wv2365 carries a single chromosome of 1.57 Mb with 41.3% G+C content. The species has an open pangenome with 803 core genes. Genomic and metabolomic features converged on carbohydrate and amino acid metabolism, including glycolysis/tricarboxylic acid (TCA) cycle and arginine pathways, and a carbohydrate-active enzyme (CAZyme) repertoire dominated by glycosyltransferases. In vitro, Wv2365 tolerated pH 3.0 and 0.3% bile, showed auto-aggregation, surface hydrophobicity, and 2,2-diphenyl-1-picrylhydrazyl (DPPH) and hydroxyl radical scavenging. The strain was susceptible to 10 antibiotics tested except for its intrinsic vancomycin non-susceptibility and was non-hemolytic and gelatinase negative. No acquired antimicrobial resistance or virulence genes were found in the genome. These findings indicate that W. viridescens Wv2365 is safe with probiotic traits relevant to gastrointestinal survival, colonization, and redox balance.

RevDate: 2026-01-09

Beavan AJS, Domingo-Sananes MR, McInerney JO (2026)

PanForest: predicting genes in genomes using random forests.

Bioinformatics (Oxford, England) pii:8418381 [Epub ahead of print].

MOTIVATION: The presence or absence of some genes in a genome can influence whether other genes are likely to be present or absent. Understanding these gene co-occurrence and avoidance patterns reveals fundamental principles of genome organisation, with applications ranging from evolutionary reconstruction to rational design of synthetic genomes.

IMPLEMENTATIONS: PanForest, presented here, uses random forest classifiers to predict the presence and absence of genes in genomes from the set of other genes present. Performance statistics output by PanForest reveal how predictable each gene's presence or absence is, based on the presence or absence of other genes in the genome. Further, PanForest produces statistics indicating the importance of each gene in predicting the presence or absence of each other gene. The PanForest software can run serially or in parallel, thereby facilitating the analysis of pangenomes at Network of Life scale.

RESULTS: A pangenome of 12,741 accessory genes in 1,000 Escherichia coli genomes was analysed in around 5 hours using 8 processors. To demonstrate PanForest's utility, we present a case study and show that certain genes associated with resistance to antimicrobial drugs reliably predict the presence or absence of other genes associated with resistance to the same drug. Further, we highlight several associations between those genes and others not known to be associated with antimicrobial resistance (AMR), or associated with resistance to other drugs. We envisage PanForest's use in studies from multiple disciplines concerning the dynamics of gene distributions in pangenomes ranging from biomedical science and synthetic biology to molecular ecology.

AVAILABILITY: The software is freely available with a full manual and can be found at www.github.com/alanbeavan/PanForest (DOI: https://doi.org/10.5281/zenodo.17865482).

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

RevDate: 2026-01-09
CmpDate: 2026-01-09

Sarkar J (2025)

Core genome expansion in Brevibacterium across marine provinces reveals genomic footprint for long-term marine adaptation.

Iranian journal of microbiology, 17(6):912-928.

BACKGROUND AND OBJECTIVES: Actinobacteria are ubiquitous across diverse environmental niches. Brevibacterium strains within this phylum are widely distributed in both marine and terrestrial ecosystems worldwide. Marine environments are defined by distinct physicochemical properties (high salinity, alkaline pH, fluctuating O2 levels, and dynamic nutrient availability) that set them apart from terrestrial habitats. The broad ecological range of Brevibacterium strains raises questions about the genome-encoded metabolic features that have evolved for adaptation to marine environments.

MATERIALS AND METHODS: Genomics of Brevibacterium strains from various marine provinces was analyzed, focusing on core genome and pan-genome structure.

RESULTS: Core genome and pan-genome derived phylograms reveal a distinct polyphyletic origin of marine strains, as evidenced by their phylogenetic proximity despite diverse species affiliations. Only 1.16% of gene clusters from the total nonredundant gene repertoire were part of the core genome. Core genome size is shaped by geographical distribution. Notably, when strains from localized regions are analyzed, the core genome expands, indicating specialized functional requirements of additional genes within that environment. In marine isolates, the core genome includes genes involved in nutrient uptake, osmoregulation, and resistance to sediment genotoxicity. Additionally, a marine province-specific core genome analysis reveals genomic adaptations essential for acclimatization across different environments, regardless of species-level taxonomy.

CONCLUSION: Microbial genome evolution is shaped by ecological niche differentiation. The emergence and spread of habitats driven by tectonic plate movements may contribute to province-specific genomic divergence in Brevibacterium. This hypothesis merits further investigation, particularly as genomic data from deeper, geologically stable environments such as marine sediments become more accessible.
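
The core-genome fraction reported in this abstract (the share of all nonredundant gene clusters that occur in every genome) can be illustrated with a small presence/absence calculation. The following is a minimal sketch on invented toy data; `core_and_pan` and the strain and cluster names are hypothetical and do not come from the study. It also shows, on the toy data only, how restricting the analysis to a subset of strains can enlarge the core.

```python
def core_and_pan(presence):
    """Return (core, pan, core_fraction) for a mapping
    genome name -> set of gene-cluster identifiers."""
    genomes = list(presence.values())
    if not genomes:
        raise ValueError("at least one genome is required")
    pan = set().union(*genomes)         # clusters found in any genome
    core = set.intersection(*genomes)   # clusters found in every genome
    fraction = len(core) / len(pan) if pan else 0.0
    return core, pan, fraction


if __name__ == "__main__":
    # Invented toy data: two "marine" strains and one "soil" strain.
    toy = {
        "marine_A": {"g1", "g2", "g3", "g4"},
        "marine_B": {"g1", "g2", "g3", "g5"},
        "soil_C": {"g1", "g6", "g7"},
    }
    _, _, all_fraction = core_and_pan(toy)
    marine_only = {name: genes for name, genes in toy.items() if name.startswith("marine")}
    _, _, marine_fraction = core_and_pan(marine_only)
    print(f"core fraction, all strains:    {all_fraction:.2%}")
    print(f"core fraction, marine strains: {marine_fraction:.2%}")
```

Real analyses define gene clusters by sequence similarity and often relax the "every genome" criterion (for example, to tolerate incomplete assemblies); this sketch assumes clusters are already assigned and uses the strict definition.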

RevDate: 2026-01-09
CmpDate: 2026-01-09

Balogun IO, Mancuso CP, Lieberman TD (2025)

High Precision Binary Trait Association on Phylogenetic Trees.

bioRxiv : the preprint server for biology pii:2025.12.24.696407.

UNLABELLED: Traditional methods for identifying associations between genomic features and traits, or between pairs of genomic traits, struggle when applied to bacterial genomes. While several microbial GWAS (mGWAS) methods have been developed to account for the fact that genome-wide linkage in bacteria creates strong evolutionary-induced associations, these methods have high false discovery rates or lack statistical power, have poor performance on negative interactions, and face computational limits at the scale required for pangenome-wide study of gene-gene interactions. Here, we present SimPhyNI, a computationally optimized framework for efficient and rigorous mGWAS studies. SimPhyNI builds null co-occurrence distributions by independently simulating traits using phylogenetically-informed parameters, novelly including time to first event. The constrained variation in these simulations, combined with log odds ratio scoring for comparing across traits, robustly identifies both positive and negative associations. Using synthetic datasets mimicking both gene-gene and gene-trait associations, we demonstrate that SimPhyNI achieves high precision and recall for both positive and negative interactions. We demonstrate SimPhyNI's utility by detecting interactions between phage defense systems in E. coli and gene-gene interactions across the entire E. coli pangenome (>9 million tests). Though developed here for binary traits, SimPhyNI's design supports extension to multi-state and continuous traits using generalized models of stochastic simulation. SimPhyNI's performance and scalability enable genome-wide discovery of genetic interactions that drive microbial function, ecology, and disease.

DATA SUMMARY: SimPhyNI is publicly available at https://github.com/jpeyemi/SimPhyNI, and code for related benchmarking, validation, and biological analyses is available at doi.org/10.5061/dryad.9kd51c5xt. The neighbor-joining phylogenetic tree and phage defense system annotations used in this study were obtained from Wu et al. (2024). A representative set of Escherichia coli genomes and the corresponding maximum-likelihood phylogenetic tree were downloaded from the PanX database (https://pangenome.org/Escherichia_coli).

IMPACT STATEMENT: Understanding how bacterial genes associate with traits and with one another is essential for predicting disease outcomes, antibiotic resistance, and future evolution. However, identifying these interactions is challenging because shared ancestry creates false correlations. SimPhyNI overcomes this through an ancestry-informed statistical simulation process, achieving near-zero false positive rates while maintaining computational efficiency for large scale analyses. This efficiency enables systematic mapping of gene-gene interaction networks across large datasets containing thousands of genes and genomes. As microbial genomic datasets continue to expand, SimPhyNI's scalability and precision will accelerate discovery of the mechanistic principles underlying infectious disease, microbiome function, and microbial evolution and ecology.
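
The abstract mentions log odds ratio scoring of co-occurrence between binary traits. As a rough illustration of that scoring step alone, the sketch below computes a log odds ratio from two presence/absence vectors. It is not SimPhyNI's implementation: the function name is hypothetical, the 0.5 pseudocount (a Haldane-Anscombe style correction to avoid division by zero) is an assumption, and the phylogenetically informed simulation of the null distribution described in the abstract is not reproduced here.

```python
import math


def log_odds_ratio(trait_a, trait_b, pseudocount=0.5):
    """Log odds ratio of co-occurrence for two equal-length binary vectors.

    Positive values suggest co-occurrence, negative values avoidance.
    The pseudocount is an assumed correction for empty table cells.
    """
    if len(trait_a) != len(trait_b):
        raise ValueError("trait vectors must have the same length")
    both = sum(1 for a, b in zip(trait_a, trait_b) if a and b)
    only_a = sum(1 for a, b in zip(trait_a, trait_b) if a and not b)
    only_b = sum(1 for a, b in zip(trait_a, trait_b) if b and not a)
    neither = sum(1 for a, b in zip(trait_a, trait_b) if not a and not b)
    numerator = (both + pseudocount) * (neither + pseudocount)
    denominator = (only_a + pseudocount) * (only_b + pseudocount)
    return math.log(numerator / denominator)


if __name__ == "__main__":
    # Invented presence/absence calls for three genes across six genomes.
    gene_x = [1, 1, 1, 0, 0, 0]
    gene_y = [1, 1, 1, 0, 0, 0]   # always found with gene_x
    gene_z = [0, 0, 0, 1, 1, 1]   # never found with gene_x
    print("x vs y:", round(log_odds_ratio(gene_x, gene_y), 3))
    print("x vs z:", round(log_odds_ratio(gene_x, gene_z), 3))
```

A raw score like this ignores shared ancestry, which is the problem the abstract highlights; in practice the observed score would be compared against scores obtained from traits simulated on the phylogeny.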

RevDate: 2026-01-09
CmpDate: 2026-01-09

Fu Q, Xin Z, Miao B, et al (2026)

Leveraging Human Pangenome for Improved Somatic Variant Detection.

bioRxiv : the preprint server for biology pii:2026.01.04.697580.

Somatic variant detection is technically challenging due to low variant allele fractions, the confounding presence of germline variation, and reference bias. Linear references such as GRCh38 miss sample-specific variation, causing misalignments and incorrect variant calls. Although telomere-to-telomere donor-specific assemblies (DSAs) accurately represent individual genomes, their application is limited by cost and technical barriers. Alternatively, the graph-based human pangenome provides a scalable framework to improve read alignment and perform genome inference. Here, we benchmarked somatic variant detection using GRCh38, graph-based pangenomes, and pangenome-inferred DSAs with a HapMap mixture dataset and the COLO829 melanoma cell line. Pangenome-guided alignment improves read mapping and somatic variant calling accuracy. Furthermore, personalized pangenomes partially reconstruct donor-specific genomic content, improving accuracy, reducing germline contamination, and enabling detection of events in loci absent or poorly represented in GRCh38. These findings demonstrate that graph-based and personalized pangenomes are effective strategies for enhancing somatic variant detection compared with GRCh38.

RevDate: 2026-01-08
CmpDate: 2026-01-08

Dimonaco NJ (2026)

PyamilySeq: exposing the fragility of conventional gene (re)clustering and prokaryotic pangenomic inference methods.

NAR genomics and bioinformatics, 8(1):lqaf198.

Pangenomics has become a central framework for exploring microbial diversity and evolution, enabling researchers to distinguish genes that define shared biological function from those that drive adaptation. However, this relies on clustering genes by sequence similarity, a process that is far less deterministic than often assumed. This study introduces PyamilySeq, a transparent and flexible toolkit designed to diagnose and quantify hidden biases within gene clustering and pangenome inference methodologies. Using PyamilySeq, we can see how clustering thresholds (often hard-coded and poorly documented) and paralog handling can substantially alter gene family composition. Surprisingly, even parameters unrelated to clustering, such as decimal precision (0.8 versus 0.80), output selection, and even CPU and memory allocation, can alter gene family assignments, challenging the assumption that identical clustering thresholds yield consistent results. Furthermore, tools often fail to report biologically meaningful or representative sequences for gene families, undermining downstream analyses. These findings reveal systematic fragilities in gene clustering and pangenome construction and highlight that pangenomics is not merely a data-driven task but a methodological one, where transparency, reproducibility, and interpretability are as critical as biological insight. This work calls for a re-evaluation of how pangenomes are constructed and compared, and advocates for methodologies that make their assumptions explicit and their results verifiable.

RevDate: 2026-01-07

Sekar YS, Chellapandi P, Suresh KP, et al (2026)

Genomic adaptability and virulence of Bacillus anthracis: a machine learning-based pan-genome and comparative analysis.

BMC genomics pii:10.1186/s12864-025-12348-5 [Epub ahead of print].

RevDate: 2026-01-07

Lee CH, Bui TPN, Petitfils C, et al (2026)

Novel myo-inositol to butyrate fermentation pathway in the prevalent human gut species Dysosmobacter welbionis, a bacterium associated with improved metabolic and liver health.

Gut pii:gutjnl-2025-336617 [Epub ahead of print].

BACKGROUND: Dysosmobacter welbionis is a recently discovered butyrate producer whose presence in stool correlates with improved metabolic health. Whether its abundance is reduced in individuals with metabolic dysfunction-associated steatotic liver disease (MASLD) remains unknown. Mechanistic insight into its butyrate production from myo-inositol, a dietary compound from fruits, beans, grains and nuts with metabolic benefits, is also limited.

OBJECTIVE: To assess population-level distribution, relative abundance and strain diversity of D. welbionis in humans, and to elucidate its metabolic capacity to ferment myo-inositol into butyrate.

DESIGN: We analysed several human cohorts for associations with liver health and evaluated D. welbionis J115[T] supplementation in a diet-induced steatosis mouse model. An antibody-guided anaerobic cell-sorting strategy enabled isolation of distinct strains. We combined [13]C-labelled inositol isotopes with NMR, mass spectrometry, genomics and proteomics.

RESULTS: We found that D. welbionis and two related species (D. hominis and D. segnis) are prevalent bacteria in the human gut. D. welbionis abundance was reduced in MASLD across two cohorts and inversely correlated with fibrosis score in a third cohort. Treatment with D. welbionis J115[T] improved glycaemia and hepatic steatosis in high-fat diet fed mice. We identified a non-canonical myo-inositol-to-butyrate fermentation pathway. Nineteen human strains were isolated, and comparative genomics of 23 strains revealed an open pangenome (about 2100 core genes) including the full myo-inositol fermentation pathway.

CONCLUSION: D. welbionis possesses a unique, conserved route to convert dietary myo-inositol into butyrate, distinguishing it from other commensals and supporting its potential as a next-generation probiotic for metabolic and liver health.

RevDate: 2026-01-07

Lucà S, Masillo F, Lipták Z (2025)

Measuring genomic data with prefix-free parsing.

Computational biology and chemistry, 122:108870 pii:S1476-9271(25)00534-1 [Epub ahead of print].

Prefix-free parsing (Boucher et al., 2019) is a highly effective heuristic for computing text indexes for very large amounts of biological data. The algorithm constructs a data structure, the prefix-free parse (PFP) of the input, consisting of a dictionary and a parse, which is then used to speed up computation of the final index. In this paper, we study the size of the PFP, which we refer to as π, and show that it is a powerful tool in its own right. To show this, we present two use cases. We first study the application of π as a repetitiveness measure of the input text, and compare it to other currently used repetitiveness measures, including z (the number of Lempel-Ziv phrases), r (the number of runs of the Burrows-Wheeler Transform), and δ (the text's substring complexity). We then turn to the use of π as a measure for pangenome openness. In both applications, our results are similar to existing measures, but our tool, in almost all cases, is more efficient than those computing the other measures, both in terms of time and space, sometimes by orders of magnitude. We close the paper with a detailed systematic study of the parameter choice for PFP (window size w and modulus p). This gives rise to interesting open questions. AVAILABILITY AND IMPLEMENTATION: The source code is available at https://github.com/simolucaa/piPFP. The accession codes for all the datasets used and the raw results are available at https://github.com/simolucaa/piPFP_experiments.
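
The dictionary-and-parse construction described in this abstract can be illustrated with a minimal Python sketch. This is a simplified toy, not the piPFP implementation: the hash is recomputed per window instead of rolled, the function names (`window_hash`, `prefix_free_parse`, `pfp_size`, `reconstruct`) and the `$` sentinel are hypothetical, and taking π as the total dictionary length plus the number of phrases in the parse is an assumption about how "size of the PFP" is measured.

```python
def window_hash(window: str, base: int = 256, prime: int = 1_000_000_007) -> int:
    """Deterministic polynomial hash of one window (recomputed, not rolling)."""
    h = 0
    for ch in window:
        h = (h * base + ord(ch)) % prime
    return h


def prefix_free_parse(text: str, w: int, p: int, sentinel: str = "$"):
    """Split text into overlapping phrases delimited by trigger windows.

    A window of length w is a trigger when its hash is 0 modulo p. Each
    phrase ends with a trigger, and the next phrase starts with that same
    trigger, so consecutive phrases overlap by w characters.
    """
    if w < 1 or sentinel in text:
        raise ValueError("w must be >= 1 and the sentinel must not occur in text")
    s = text + sentinel * w          # the final window of sentinels always triggers
    phrases = []
    start = 0
    last = len(s) - w
    for i in range(1, last + 1):
        if i == last or window_hash(s[i:i + w]) % p == 0:
            phrases.append(s[start:i + w])
            start = i
    dictionary = sorted(set(phrases))                 # distinct phrases
    rank = {phrase: r for r, phrase in enumerate(dictionary)}
    parse = [rank[phrase] for phrase in phrases]      # text as a sequence of ranks
    return dictionary, parse


def pfp_size(dictionary, parse) -> int:
    """Assumed size measure: total dictionary length plus parse length."""
    return sum(len(phrase) for phrase in dictionary) + len(parse)


def reconstruct(dictionary, parse, w: int) -> str:
    """Invert the parse: glue phrases together, dropping the w-character overlaps."""
    phrases = [dictionary[r] for r in parse]
    s = phrases[0] + "".join(phrase[w:] for phrase in phrases[1:])
    return s[:-w]                                     # strip the trailing sentinels


if __name__ == "__main__":
    text = "GATTACAGATTACAGATTACA"
    dictionary, parse = prefix_free_parse(text, w=2, p=3)
    print(pfp_size(dictionary, parse), reconstruct(dictionary, parse, 2) == text)
```

In this toy, highly repetitive input tends to reuse the same phrases, so the dictionary stays small relative to the text; that intuition is what makes a PFP-size measure plausible as a repetitiveness indicator.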

RevDate: 2026-01-07
CmpDate: 2026-01-07

Zaccaron AZ, Lassagne A, Søndreli KL, et al (2026)

The poplar pathogen Sphaerulina musiva has a dynamic genome architecture marked by chromosomal inversions and changes in transposable element abundance.

Microbial genomics, 12(1):.

Fungal plant pathogens possess dynamic genomes, frequently shaped by transposable elements, that enable rapid adaptation to adverse conditions and host resistance mechanisms. However, assessing the adaptive significance of these genomic features remains challenging, in part due to the lack of high-quality genome assemblies for multiple members of a given species. To gain insights into genomic factors shaping pathogen evolution, we sequenced and assembled near-chromosome-scale genomes of 18 geographically diverse North American isolates of Sphaerulina musiva, an important pathogen causing Septoria leaf spot and stem canker disease of poplar trees. Comparative genomic analyses indicated that all isolates possess 13 chromosomes with no evidence of accessory chromosomes. Transposable element (TE) content varied considerably among isolates (6.8%-15.7%), with a higher abundance in isolates from Oregon, British Columbia and Alberta, geographic regions outside the native range of S. musiva. The variation in TE content largely explained differences in genome size among isolates and suggested lineage-specific proliferation of TEs. Although a gene-based pangenome analysis indicated a relatively low percentage (9.5%) of accessory genes, this subset was enriched for candidate effectors. Our results indicate that S. musiva exhibits features of a 'one-speed genome' model. However, increased TE content is correlated with longer intergenic regions of candidate effector genes, suggesting that proliferation of TEs may be driving increased compartmentalization. Finally, synteny analysis revealed a total of 43 long chromosomal inversions with an average size of 293 kb that covered 34% of the S. musiva genome. These chromosomal inversions were more frequently observed in isolates from the pathogen's native range in the Eastern USA, and at least one inversion was predicted to affect the organization of a secondary metabolite gene cluster. These findings provide novel insights into the genome structure, TE dynamics and chromosomal rearrangements of the poplar pathogen S. musiva, offering a foundation for understanding its evolution and adaptation across diverse geographic regions and host species.

RevDate: 2026-01-07
CmpDate: 2026-01-07

Lu Y, Guo L, Wei Z, et al (2025)

Pan-genomic insights into LTP gene family evolution across diploid cotton species.

Frontiers in plant science, 16:1691339.

INTRODUCTION: Lipid-transfer proteins (LTPs) are a class of small, alkaline proteins that bind and transport various lipid molecules, including fatty acids, phospholipids, glycolipids, and steroids, between phospholipid bilayers. They play crucial roles in signal transduction, stress tolerance, and plant growth and development.

METHODS: In this study, based on pan-genomic data, we identified 107 LTP family members across nine diploid cotton species, comprising 45 core, 43 variable, and 19 specific genes. Synteny and selection pressure analyses clarified the evolutionary relationships among these genes, while structural variation analyses revealed that although structural variants altered gene structures, domains, and cis-acting elements, they did not significantly affect gene expression.

RESULTS: Expression profiling further demonstrated that LTP genes exhibited distinct spatiotemporal expression patterns in cotton ovules and roots at different developmental stages.

DISCUSSION: Overall, these findings highlight both conserved and divergent evolutionary patterns of the LTP family among diploid cotton species, providing new insights into their functional diversification, adaptive evolution, and potential involvement in cotton fiber development and stress responses.

RevDate: 2026-01-07
CmpDate: 2026-01-07

Depuydt L, Ahmed OY, Fostier J, et al (2025)

Run-length compressed metagenomic read classification with SMEM-finding and tagging.

iScience, 28(12):114029.

Metagenomic read classification is a fundamental task in computational biology but remains challenging due to the scale and diversity of sequencing data. We present a run-length compressed BWT-based index using the move structure for efficient multi-class classification. Our method finds all super-maximal exact matches (SMEMs) of length ≥ L between a read and a reference and associates each SMEM with one class identifier using a sampled tag array. A consensus algorithm then compacts these SMEMs and their class identifiers into a single classification. We are the first to perform run-length compressed read classification using full rather than semi-SMEMs. We evaluated on long and short reads across two datasets: a large bacterial pan-genome with few classes and a smaller 16S rRNA gene database spanning thousands of genera. Our method outperforms SPUMONI 2 in accuracy and runtime while maintaining run-length compressed memory complexity and surpasses Cliffy in memory efficiency with comparable accuracy.
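
The notion of a super-maximal exact match (SMEM) of length ≥ L used in this abstract can be made concrete with a brute-force Python sketch. It is only an illustration of the definition: it uses plain substring search rather than a run-length compressed BWT index or move structure, it omits the tag array and consensus step, and the function name `find_smems` and its parameters are hypothetical.

```python
def find_smems(read: str, reference: str, min_len: int):
    """Return half-open (start, end) intervals of the read that are SMEMs.

    An SMEM is an exact match between read and reference that cannot be
    extended left or right and is not contained in any longer match.
    Brute force: time grows quickly with read and reference length.
    """
    n = len(read)
    # longest[i]: length of the longest read substring starting at i
    # that occurs somewhere in the reference.
    longest = [0] * n
    for i in range(n):
        length = 0
        while i + length < n and read[i:i + length + 1] in reference:
            length += 1
        longest[i] = length

    smems = []
    prev_end = 0  # furthest end reached by any match starting earlier
    for i in range(n):
        end = i + longest[i]
        # A match that ends beyond every earlier match is not contained
        # in any of them, so it is super-maximal.
        if longest[i] > 0 and end > prev_end and longest[i] >= min_len:
            smems.append((i, end))
        prev_end = max(prev_end, end)
    return smems


if __name__ == "__main__":
    read = "ACGTTTGCA"
    reference = "GGACGTAATTTGCC"
    for start, end in find_smems(read, reference, min_len=3):
        print(start, end, read[start:end])
```

For the toy inputs above, the two reported matches are `ACGT` (read positions 0 to 4) and `TTTGC` (positions 3 to 8); the single-base match at the end of the read is super-maximal too but is filtered out by the length threshold.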

RevDate: 2026-01-06

Ma HY, Nie S, Liu HB, et al (2026)

A pangenome insight into the genome divergence and flower color diversity among Rhododendron species.

BMC genomics pii:10.1186/s12864-025-12461-5 [Epub ahead of print].

BACKGROUND: The Rhododendron genus (Rhododendron L.), recognized as the most extensive woody plant genus in the Northern Hemisphere, captivates with its strikingly beautiful corollas and variety of flower colors. In addition, the Rhododendron genus exhibits a complex evolutionary history and substantial species diversification. To comprehensively understand the genomic complexity and flower color diversity within this genus, comparative genomics has emerged as a promising approach, enabling analysis at a super-species level.

RESULTS: Here, we collected whole-genome data from seven rhododendrons of different subgenera to investigate the patterns of interspecific genomic and sequence divergence, as well as evolutionary dynamics of gene family related to flower color. We discovered that approximately 50% of Rhododendron genomes are composed of transposable elements (TEs), with over half of them being long terminal repeat retrotransposons (LTR-RTs). TEs significantly associate with genomic differentiation and structural variances within the genus. Additionally, the duplication and loss of genes associated with flower color and their corresponding expression over time are potentially driven by TEs.

CONCLUSION: Our comparative genomic analysis accentuates the critical role of TEs in genome divergence within the Rhododendron genus, highlighting their potential role as a key factor governing speciation and interspecific variability within the genus.

RevDate: 2026-01-05
CmpDate: 2026-01-05

Kim SK, Cho YJ, Hovde CJ, et al (2025)

Comparative genome analysis of enterohemorrhagic Escherichia coli ATCC 43894 and its pO157-cured strain 277.

Journal of microbiology (Seoul, Korea), 63(12):e2511015.

Enterohemorrhagic Escherichia coli (EHEC) O157:H7 ATCC 43894 (also known as EDL932) has been widely used as a reference strain for studying the pathophysiology of EHEC. Both EHEC ATCC 43894 and its pO157-cured derivative strain 277 have been well studied, for example to elucidate the role of the large virulence plasmid pO157 and its relationship with acid resistance. However, it is unclear whether these two strains are isogenic and share the same genetic background. To address this question, we analyzed the whole genome sequences of ATCC 43894 and 277. As expected, three and two closed contigs were identified from ATCC 43894 and 277, respectively; the two contigs shared by both strains were a chromosome and a small unidentified plasmid, and the one contig found only in ATCC 43894 was pO157. Surprisingly, our pan-genome analyses of the two sequences revealed several genetic variations including frameshift, substitution, and deletion mutations. In particular, the deletion mutation of hdeD and gadE in ATCC 43894 was identified, and further PCR analysis confirmed the deletion of a 2.5-kb fragment harboring hdeD, gadE, and mdtE in ATCC 43894. Taken together, our findings demonstrate that EHEC ATCC 43894 harbors genetic mutations affecting the glutamate-dependent acid resistance system and imply that the pO157-cured EHEC 277 may not be isogenic to ATCC 43894. This is the first report of such genetic differences between the two reference strains of EHEC, which should be considered in future studies on pathogenic E. coli.

RevDate: 2026-01-02

Yang Z, Yang Z, Gao C, et al (2026)

Graph pan-genome illuminates evolutionary trajectories and agronomic trait architecture in allotetraploid cotton.

Nature genetics [Epub ahead of print].

Upland cotton (Gossypium hirsutum), one of the world's major fiber crops, faces challenges from the genetic homogeneity of modern varieties. Here we present 107 gold-standard genome assemblies spanning the wild-to-domesticated continuum, revealing six large-scale structural variations, including a chromosomal reciprocal translocation and five inversions tracing the evolutionary history of cultivated cotton in the Americas. This history also involved continuous introgression from Gossypium barbadense, shaping the genetic diversity of G. hirsutum landraces and cultivars. Leveraging the graph pan-genome, we capture the sequence and structural diversity of nucleotide-binding site-leucine-rich repeat genes, uncovering pathogen-driven selection signatures and loci associated with disease resistance. A presence-absence variation genome-wide association study (GWAS) identified previously overlooked loci for key fiber traits, complementing single-nucleotide polymorphism-GWAS findings. Additionally, we construct a detailed map of large inversions, offering insights into hybridization dynamics and strategies to mitigate linkage drag. This study enhances our understanding of cotton evolution and domestication while delivering a valuable resource to enhance breeding.

RevDate: 2026-01-02
CmpDate: 2026-01-02

Oladipo PM, Jomaa AM, Withey JH, et al (2025)

Genome organization, virulence genes, and temperature-dependent motility of an emerging pathogen, Escherichia marmotae.

Frontiers in microbiology, 16:1729604.

INTRODUCTION: Escherichia marmotae is one of the Escherichia cryptic clades that were first isolated from animal feces and environmental waters and has recently emerged as an organism of concern due to its presence in human infections. Although E. marmotae cannot be distinguished from E. coli by standard clinical tests, its 10% pairwise genomic difference from E. coli led us to investigate other phenotypic differences that may be present.

METHODS: Bioinformatic software was used to identify the E. marmotae pan-genome, antimicrobial and virulence genes, and sequences of genes for motility, biofilm formation, and other phenotypic characteristics. Environmental and clinical isolates of E. marmotae were analyzed for antimicrobial sensitivity, and for temperature effects on motility, growth, and biofilm formation, in comparison to E. coli. RT-PCR analyzed associated changes in gene expression.

RESULTS: The E. marmotae genome consists of >75% core genes, and has many accessory genes, including plasmids and antimicrobial resistance genes. E. marmotae is resistant to erythromycin. E. marmotae had all genes needed for complete flagellar gene assembly, and phenotypically was motile at 28°C, and much less motile at 37°C. More biofilm formation was observed at 28°C than at 37°C. The expression of motility genes motA and fliA decreased at 37°C in E. marmotae compared to E. coli.

CONCLUSION: These temperature-sensitive traits may support environmental persistence and adaptations that may enable E. marmotae to cause human disease.

RevDate: 2026-01-01
CmpDate: 2026-01-01

Xu C, Xu W, Yuan Y, et al (2025)

Global Stress Responses Identify the Functionally Divergent Regulators Required for Candida auris Commensalism and Pathogenicity.

Exploration (Beijing, China), 5(6):20240482.

Given its global distribution and high transmissibility in the environment, Candida auris poses a serious threat to global public health. However, the underlying mechanisms of its adaptive strategies remain poorly understood. Here we delineate the pan-genome structures of 1,306 representative C. auris isolates collected from 28 countries. In addition to the clade-related genetic diversity and highly variable pan-genomes, we identify the key regulatory modules and genes specific to C. auris in response to 32 different host microenvironment-mimicking stresses. Through comparative analysis with evolutionarily close fungal relatives, we uncover both shared and species-specific transcriptional responses in C. auris. Intriguingly, our results reveal a distinct pathogenic role for the conserved iron regulon in this species. Unexpectedly, we also identify an evolutionarily divergent functional role for RIM101 in regulating both pathogenicity and commensalism of C. auris. Mechanistically, the high-affinity glucose transporters were found to enhance the tolerance to alkaline stress through alleviation of RIM101-dependent glucose repression in the host microenvironment. These findings provide mechanistic insights into the evolutionarily divergent adaptive strategies in both commensalism and virulence of the emerging critical priority fungal pathogen, C. auris.

RevDate: 2025-12-31

Obregon-Gutierrez P, Nogales J, Gonzalez-Torres C, et al (2025)

Exploring the pangenome of Mycoplasma hyorhinis in search of potential virulence markers.

Scientific reports pii:10.1038/s41598-025-31942-x [Epub ahead of print].

Mycoplasma hyorhinis (homotypic synonym of Mesomycoplasma hyorhinis) is a pathobiont from the upper respiratory tract of pigs. Under unclear circumstances, it can disseminate systemically and cause disease. Although some studies have reported different infectious capabilities among strains, no factors have been directly linked to virulence. This study aimed to analyze the core and accessory genes of all available M. hyorhinis strains (pangenome) to identify potential virulence markers. We characterized the pangenome of 110 strains, including isolates from healthy (nasal cavity) and diseased (systemic organs, nasal cavity or lung) animals. Comparative analyses were performed according to the clinical background. Although most putative virulence genes were shared, we identified several genes absent in most health-associated strains related to DNA-processing mechanisms, including hsdM-hsdR restriction-modification system and various helicases. Furthermore, the particular analysis of variable lipoprotein (vlp) genes revealed a similar presence in all strains but higher number of repeats in region III of vlpF and in strains isolated from systemic lesions. Genome-scale metabolic models were used to infer the metabolic capabilities of the strains, showing highly conserved predicted reactomes, including growth capabilities and auxotrophies. In conclusion, although all strains may carry genes enabling disease, nasal strains from healthy animals lacked some DNA-processing genes and showed distinct vlp patterns. Additional genomes, especially from strains isolated from healthy animals, would be needed to confirm these findings.

RevDate: 2025-12-31
CmpDate: 2025-12-31

Gomes GC, Sousa EG, Quaresma LS, et al (2025)

Comparative and functional analyses of Bacillus paralicheniformis strains BAC30 and BAC220 by WGS uncover species homogeneity and biotechnological potential.

World journal of microbiology & biotechnology, 42(1):20.

The Bacillus genus includes plant growth-promoting rhizobacteria (PGPR), and the discovery of new strains within this group is of great biotechnological interest due to their ability to produce antimicrobial compounds (AMCs), vitamins, enzymes, and heterologous proteins. Among these, Bacillus paralicheniformis is a recently described species whose phylogeny remains poorly resolved, highlighting the need for further investigation. This study aimed to identify and characterize the isolates BAC30 and BAC220 using whole-genome sequencing (WGS). Both were confirmed as B. paralicheniformis and included in phylogenomic and comparative analyses with 28 other strains to assess the species' genetic structure and inter-strain similarity. Functional annotation of BAC30 and BAC220 was also performed, focusing on biotechnological potential. Comparative analysis revealed high genomic similarity among strains, including the two isolates. Pangenome analysis showed a low proportion of core genes relative to accessory genes (shell and cloud), and the rarefaction curve suggested an open pangenome, indicating the species' ubiquity and co-evolution with other organisms. Functional analysis identified genes of defense mechanisms related to beta-lactam resistance. Regarding secondary metabolite production, genes involved in the biosynthesis of vitamins (e.g., riboflavin) and AMCs (e.g., bacitracin) were detected. Although further in vitro and in vivo assays are needed to confirm gene expression, the findings support the biotechnological relevance of these isolates as potential biocontrol agents and/or producers of industrially valuable compounds.
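
The core, shell and cloud categories and the rarefaction (accumulation) curve mentioned in this abstract can be sketched from a gene presence/absence table. The snippet below is a hedged illustration: the thresholds (`core_min=0.99`, `shell_min=0.15`) are illustrative assumptions rather than the values used in the study, and the function names and toy data are hypothetical.

```python
from collections import Counter


def classify_genes(genomes: dict[str, set[str]],
                   core_min: float = 0.99,
                   shell_min: float = 0.15) -> dict[str, str]:
    """Label each gene by the fraction of genomes that carry it.

    genomes maps a strain name to the set of gene families it contains.
    The thresholds are assumptions chosen for illustration.
    """
    n = len(genomes)
    counts = Counter(gene for genes in genomes.values() for gene in genes)
    labels = {}
    for gene, count in counts.items():
        freq = count / n
        if freq >= core_min:
            labels[gene] = "core"
        elif freq >= shell_min:
            labels[gene] = "shell"
        else:
            labels[gene] = "cloud"
    return labels


def accumulation_curve(genomes: dict[str, set[str]], order: list[str]) -> list[int]:
    """Pangenome size after adding genomes one by one in the given order."""
    seen: set[str] = set()
    curve = []
    for name in order:
        seen |= genomes[name]
        curve.append(len(seen))
    return curve


if __name__ == "__main__":
    # Toy data: gene "a" in all 10 strains, "b" in 5, "c" in 1.
    genomes = {f"s{i}": {"a"} for i in range(10)}
    for i in range(5):
        genomes[f"s{i}"].add("b")
    genomes["s0"].add("c")
    print(classify_genes(genomes))
    print(accumulation_curve(genomes, sorted(genomes, reverse=True)))
```

A curve that keeps rising as genomes are added is the usual informal signature of an open pangenome; in practice the curve is averaged over many random genome orderings rather than computed for a single order as here.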

RevDate: 2025-12-31
CmpDate: 2025-12-31

Barigelli S, Koper P, Petricciuolo M, et al (2025)

Unravelling the Genomic and Virulence Diversity of Legionella pneumophila Strains Isolated from Anthropogenic Water Systems.

Microorganisms, 13(12):.

Legionella pneumophila, a waterborne pathogen naturally present in freshwater and capable of colonizing artificial water systems, is responsible for Legionnaires' disease (LD), a severe form of pneumonia transmitted through inhalation of contaminated aerosols. Virulence of Legionella strains is affected by the plasticity of their genome, shaped by horizontal gene transfer and recombination events. Thus, contaminated water systems can host diverse Legionella populations with a distinct virulence potential. Here, we compare the genomic diversity of Legionella pneumophila strains isolated in water systems of academic buildings, together with their cytotoxicity and intracellular replication in THP-1-like macrophages. A six-year environmental surveillance revealed Legionella pneumophila contamination in 20 out of the 50 monitored sites, identifying five serogroups (sg) and 13 Sequence Types (STs). Phylogenetic investigations based on core genome multilocus sequence typing (cgMLST) and comparative genomics of representative isolates of each ST showed a broad diversity and a heterogeneous virulence repertoire, especially within the Dot/Icm and Lvh secretion systems. Following macrophage infection, strain-dependent cytotoxicity and intracellular replication were observed, underscoring significant pathogenic diversity within the same species and stage-dependent infection dynamics. Together, these results showed strain-specific genetic and phenotypic virulence traits to be considered during risk assessment in environmental surveillance.

RevDate: 2025-12-31
CmpDate: 2025-12-31

Liang Y, Wang W, Guo Y, et al (2025)

Study on Genomic Diversity, Prophage Distribution of Bovine-Derived Staphylococcus aureus and Their Association with Antimicrobial Resistance.

Microorganisms, 13(12):.

Staphylococcus aureus is the core pathogen causing bovine mastitis, and its antimicrobial resistance evolution is closely linked to prophage-mediated genetic material transfer, but their systematic association remains unclear. This study focused on 101 bovine-derived S. aureus strains isolated from large-scale dairy farms in Shihezi, Xinjiang, from September 2024 to January 2025, to explore their genomic diversity, prophage distribution characteristics, and intrinsic links to resistance. Results showed that the strains had resistance rates of 0.00-80.20% to 18 antibiotics across 12 classes, with ceftiofur having the highest resistance rate (80.20%) and 10 antibiotics including amoxicillin showing 0.00% resistance. Multidrug-resistant (MDR) strains accounted for 9.9% (10 strains), among which 2 had a resistance spectrum covering 7 antibiotic classes. The average genome size was 2.57 Mb with a GC content of 33.44%, cloud genes accounted for 85.00% of the pan-genome, and MLST identified 14 ST types, with ST5404 as the dominant type (36.6%). A total of 398 prophages were detected: 82.18% of strains carried resistance genes via prophages (Type I), while this proportion was 50.00% in MDR strains (Type II). This study confirms that prophages synergize with the ST5404 clonal group to promote clustered resistance gene transmission, providing a scientific basis for regional control of mastitis-causing drug-resistant strains and precise drug use.

RevDate: 2025-12-31
CmpDate: 2025-12-31

Amirgazin A, Yessembekova G, Akhmetova A, et al (2025)

Genomic Insights into Pasteurella multocida Serotype B:2 from Hemorrhagic Septicemia Outbreaks in Wildlife and Livestock in Kazakhstan.

Pathogens (Basel, Switzerland), 14(12): pii:pathogens14121273.

Outbreaks of hemorrhagic septicemia (HS) caused by Pasteurella multocida serogroup B are endemic in Kazakhstan. These outbreaks have repeatedly led to mass mortality events among wild saigas and economic losses to farms. The aim of this study was to conduct the first whole-genome sequencing (WGS) and analysis of P. multocida genomes associated with HS cases in saigas and livestock in Kazakhstan. In this study, WGS was performed on 22 P. multocida isolates obtained from saigas and livestock. A comparative genomic analysis of P. multocida isolates from Kazakhstan and publicly available genomes was performed. All isolates belonged to the B:2:ST122 genotype and formed distinct phylogenetic clusters based on outbreaks in saiga populations and livestock. Clustering also corresponded to identified mutations in virulence genes. Isolates recovered from the 2015 mass mortality of saigas in the Betpak-Dala population were found to have a deletion of the flp1 gene. This observation highlights the need to study the role of Flp pili in HS pathogenesis. Comparison of the P. multocida B:L2:ST122 genomes revealed low virulence gene diversity and an open pangenome. Prophage annotation did not identify virulence or pathogenicity genes. The obtained results will be useful for future studies of HS pathogenesis.

RevDate: 2025-12-31
CmpDate: 2025-12-31

Bhowmik S, Rivu S, Bari ML, et al (2025)

Genome Mining of Cronobacter sakazakii in Bangladesh Reveals the Occurrence of High-Risk ST83 and Rare ST789 Lineages.

Pathogens (Basel, Switzerland), 14(12): pii:pathogens14121220.

Cronobacter sakazakii is a foodborne pathogen of major concern due to its link with severe neonatal infections through powdered infant formula (PIF). However, its genomic epidemiology in Bangladesh remains uncharacterized. We report the first whole-genome analysis of three isolates from PIF. Two isolates (S41_PIFM and S44_RUTF) belonged to ST83, a lineage repeatedly associated with neonatal meningitis, septicemia, and persistence in PIF production environments, while the third (S43_TF) represented ST789, a recently described and rare lineage of unknown pathogenic potential. Pan-genome and comparative analyses identified 39 virulence determinants, 19 antimicrobial-resistance genes, and diverse mobile genetic elements. ST83 isolates harbored plasmid replicons IncFII(pCTU2) and pESA2, while the ST789 isolate carried insertion sequence ISKpn34, indicating horizontal gene transfer potential. All strains encoded I-E CRISPR-Cas systems. The detection of globally recognized high-risk ST83 clones alongside the novel ST789 lineage highlights emerging public health risks. This study provides the first genomic insights into C. sakazakii in Bangladesh and underscores the urgent need for genomic surveillance and strengthened food safety monitoring to protect infant health in low- and middle-income countries.

RevDate: 2025-12-31
CmpDate: 2025-12-31

Li SN, Li YL, Sun MH, et al (2025)

Pangenome-Wide Identification, Evolutionary Analysis of Maize ZmPLD Gene Family, and Functional Validation of ZmPLD15 in Cold Stress Tolerance.

Plants (Basel, Switzerland), 14(24):.

Phospholipase D (PLD) genes play key roles in plant abiotic stress responses, but the systematic identification of the maize (Zea mays) PLD family and its cold tolerance mechanism remain unclear. Using 26 maize genomes (pangenome), we identified 21 ZmPLD members via Hidden Markov Model (HMM) search (Pfam domain PF00614), including five private genes, thereby avoiding the gene omission that results from using a single reference genome. Phylogenetic analysis showed ZmPLD conservation with Arabidopsis and rice PLDs; Ka/Ks analysis revealed most ZmPLDs under purifying selection, while three genes (including ZmPLD15) had positive selection signals, suggesting roles in maize adaptive domestication. For ZmPLD15, five shared structural variations (SVs) were found in its promoter; some contained ERF/bHLH binding sites, and SVs in Region1/5 significantly regulated ZmPLD15 expression. Protein structure prediction and molecular docking showed conserved ZmPLD15 structure and substrate (1,2-diacyl-sn-glycero-3-phosphocholine) binding energy across germplasms. Transgenic maize (B73 background) overexpressing ZmPLD15 was generated. Cold stress (8-10 °C, 6 h) and recovery (24 h) on three-leaf seedlings showed transgenic plants had better leaf cell integrity than wild type (WT). Transgenic plants retained 45.8% net photosynthetic rate (Pn), 47.9% stomatal conductance (Gs), and 55.8% transpiration rate (Tr) versus 7.6%, 21.3%, 13.8% in WT; intercellular CO2 concentration (Ci) was maintained properly. This confirms ZmPLD15 enhances maize cold tolerance by protecting photosynthetic systems, providing a framework for ZmPLD research and a key gene for cold-tolerant maize breeding.

RevDate: 2025-12-31

Cumer T, Milia S, Leonard AS, et al (2025)

PG-SCUnK: measuring pangenome graph representativeness using single-copy and universal K-mers.

BMC bioinformatics pii:10.1186/s12859-025-06355-2 [Epub ahead of print].

BACKGROUND: Pangenome graphs integrate multiple assemblies to represent non-redundant genetic diversity. However, current evaluations of pangenome graphs rely primarily on technical parameters (e.g., total length, number of nodes/edges, growth curves), which fail to assess how effectively the graph represents homologous stretches across the integrated assemblies and how well short reads align against pangenome graph references.

RESULTS: We introduce a novel method to quantitatively assess how well a pangenome graph represents its integrated assemblies. Our method quantifies how many single-copy and universal k-mers from the source assemblies are uniquely and completely represented within the graph nodes. Implemented in the open-source tool PG-SCUnK, this approach identifies the fractions of unique, duplicated, and split k-mers, which correlate with short read mapping rates to the pangenome graph.

CONCLUSIONS: Insights provided by PG-SCUnK facilitate the selection of appropriate parameters to build optimal reference pangenome graphs.
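
The idea of scoring a graph by its single-copy and universal k-mers can be sketched in a few lines of Python. This is an illustrative toy and not the PG-SCUnK implementation: each assembly is treated as one string, reverse complements are ignored, a k-mer found in no node is assumed to be "split" across node boundaries, and all function names are hypothetical.

```python
from collections import Counter


def kmers(seq: str, k: int):
    """Yield every k-mer of seq, left to right."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def single_copy_universal_kmers(assemblies: list[str], k: int) -> set[str]:
    """K-mers that occur exactly once in every assembly."""
    shared = None
    for seq in assemblies:
        counts = Counter(kmers(seq, k))
        single = {kmer for kmer, c in counts.items() if c == 1}
        shared = single if shared is None else shared & single
    return shared or set()


def score_graph(assemblies: list[str], node_seqs: list[str], k: int) -> dict[str, float]:
    """Fractions of single-copy universal k-mers that are unique, duplicated or split.

    unique: found exactly once inside the node sequences.
    duplicated: found more than once.
    split: found in no node (assumed to straddle a node boundary).
    """
    scunk = single_copy_universal_kmers(assemblies, k)
    node_counts = Counter(kmer for node in node_seqs for kmer in kmers(node, k))
    total = len(scunk)
    if total == 0:
        return {"unique": 0.0, "duplicated": 0.0, "split": 0.0}
    unique = sum(1 for kmer in scunk if node_counts[kmer] == 1)
    duplicated = sum(1 for kmer in scunk if node_counts[kmer] > 1)
    split = total - unique - duplicated
    return {"unique": unique / total,
            "duplicated": duplicated / total,
            "split": split / total}


if __name__ == "__main__":
    assemblies = ["GATTACAG", "GATTACAC"]
    nodes = ["GATTA", "TTAC", "CA"]   # toy node sequences of a hypothetical graph
    print(score_graph(assemblies, nodes, k=3))
```

In this toy, a high unique fraction would suggest that shared, unambiguous sequence is represented once and contiguously in the graph, which is the property the abstract relates to short-read mapping rates.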

