Bibliography on: Pangenome

The Electronic Scholarly Publishing Project: Providing world-wide, free access to classic scientific papers and other scholarly materials, since 1993.

ESP: PubMed Auto Bibliography. Created: 07 Sep 2025 at 01:32

Pangenome

Although the enforced stability of genomic content is ubiquitous among multicellular eukaryotes (MCEs), the opposite is proving to be the case among prokaryotes, which exhibit remarkable and adaptive plasticity of genomic content. Early bacterial whole-genome sequencing efforts discovered that whenever a particular "species" was re-sequenced, new genes were found that had not been detected earlier: entirely new genes, not merely new alleles. This led to the concepts of the bacterial core-genome, the set of genes found in all members of a particular "species", and the flex-genome, the set of genes found in some, but not all, members of the "species". Together these make up the species' pan-genome.
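
The definitions above translate directly into set operations. The following is a minimal sketch, assuming each strain is represented simply as a set of gene names; the strain labels, the gene lists, and the function name `partition_pangenome` are hypothetical and chosen only for illustration. Real pangenome tools must first cluster genes into orthologous families, a step this sketch skips.

```python
from typing import Dict, Set, Tuple


def partition_pangenome(
    genomes: Dict[str, Set[str]]
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (core, flex, pan) gene sets for a collection of genomes.

    core: genes present in every genome
    flex: genes present in some, but not all, genomes
    pan:  union of all genes (core plus flex)
    """
    if not genomes:
        raise ValueError("at least one genome is required")
    gene_sets = list(genomes.values())
    pan = set().union(*gene_sets)        # every gene seen in any strain
    core = set.intersection(*gene_sets)  # genes shared by all strains
    flex = pan - core                    # everything that is not core
    return core, flex, pan


# Hypothetical toy data: three strains of one "species".
strains = {
    "strain_A": {"dnaA", "gyrB", "recA", "blaTEM"},
    "strain_B": {"dnaA", "gyrB", "recA", "tetA"},
    "strain_C": {"dnaA", "gyrB", "recA", "blaTEM", "mcr1"},
}

core, flex, pan = partition_pangenome(strains)
print("core:", sorted(core))
print("flex:", sorted(flex))
print("pan size:", len(pan))
```

Adding a newly sequenced strain to `strains` can only shrink or preserve the core set and grow or preserve the pan set, which mirrors the observation that re-sequencing a "species" keeps turning up genes not seen before.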

Created with PubMed® Query: ( pangenome OR "pan-genome" OR "pan genome" ) NOT pmcbook NOT ispreviousversion

Citations: The Papers (from PubMed®)

RevDate: 2025-09-06

Soleymani F, Correa SM, Arend M, et al (2025)

Constraint-based metabolic modeling reveals metabolic properties underpinning the unprecedented growth of Chlorella ohadii.

The New phytologist [Epub ahead of print].

Comparative molecular and physiological analyses of organisms from one taxonomic group grown under similar conditions offer a strategy to identify gene targets for trait improvement. While this strategy can also be performed in silico using genome-scale metabolic models for the compared organisms, we continue to lack solutions for the de novo generation of such models, particularly for eukaryotes. To facilitate model-driven identification of gene targets for growth improvement in green algae, here we present a semiautomated platform for de novo generation of genome-scale algal metabolic models. We deployed this platform to reconstruct an enzyme-constrained, genome-scale metabolic model of Chlorella ohadii, the fastest growing green alga reported to date, and validated the growth predictions in experiments under three growth conditions. We also proposed a computational strategy to identify targets for growth improvement based on flux analyses. Extensive flux-based comparative analyses using all existing models of green algae resulted in the identification of potential targets for growth improvement not only in standard but also in extreme light conditions, where C. ohadii still exhibits exceptional growth. Our findings indicate that the developed platform provides the basis for the generation of pan-genome-scale metabolic models of algae.

RevDate: 2025-09-05

Chandola U, Manirakiza E, Maillard M, et al (2025)

A Bradyrhizobium isolate from a marine diatom induces nitrogen-fixing nodules in a terrestrial legume.

Nature microbiology [Epub ahead of print].

Biological nitrogen fixation converts atmospheric nitrogen into ammonia, essential to the global nitrogen cycle. While cyanobacterial diazotrophs are well characterized, recent studies have revealed a broad distribution of non-cyanobacterial diazotrophs (NCDs) in marine environments, although their study is limited by poor cultivability. Here we report a previously uncharacterized Bradyrhizobium isolated from the marine diatom Phaeodactylum tricornutum. Phylogenomic analysis places the strain within photosynthetic Bradyrhizobium, suggesting evolutionary adaptations to marine and terrestrial niches. Average nucleotide identity supports its classification as a previously undescribed species. Remarkably, inoculation experiments showed that the isolate induced nitrogen-fixing nodules in the Aeschynomene indica legume, pointing to symbiotic capabilities across ecological boundaries. Pangenome analysis and metabolic predictions indicate that this isolate shares more features with terrestrial photosynthetic Bradyrhizobium than with marine NCDs. Overall, these findings suggest that symbiotic interactions could evolve across different ecological niches, and raise questions about the evolution of nitrogen fixation and microbe-host interactions.

RevDate: 2025-09-05
CmpDate: 2025-09-05

Behruznia M, Marin M, Whiley DJ, et al (2025)

The Mycobacterium tuberculosis complex pangenome is small and shaped by sub-lineage-specific regions of difference.

eLife, 13:.

The Mycobacterium tuberculosis complex (MTBC) is a group of bacteria causing tuberculosis (TB) in humans and animals. Understanding MTBC genetic diversity is crucial for insights into its adaptation and traits related to survival, virulence, and antibiotic resistance. While it is known that within-MTBC diversity is characterised by large deletions found only in certain lineages (regions of difference [RDs]), a comprehensive pangenomic analysis incorporating both coding and non-coding regions remains unexplored. We utilised a curated dataset representing various MTBC genomes, including under-represented lineages, to quantify the full diversity of the MTBC pangenome. The MTBC was found to have a small, closed pangenome with distinct genomic features and RDs both between lineages (as previously known) and between sub-lineages. The accessory genome was identified to be a product of genome reduction, showing both divergent and convergent deletions. This variation has implications for traits like virulence, drug resistance, and metabolism. The study provides a comprehensive understanding of the MTBC pangenome, highlighting the importance of genome reduction in its evolution, and underlines the significance of genomic variations in determining the pathogenic traits of different MTBC lineages.

RevDate: 2025-09-05

Stepanauskas R, Brown JM, Arasti S, et al (2025)

Net rate of lateral gene transfer in marine prokaryoplankton.

The ISME journal pii:8248340 [Epub ahead of print].

Lateral gene transfer is a major evolutionary process in Bacteria and Archaea. Despite its importance, lateral gene transfer quantification in nature using traditional phylogenetic methods has been hampered by the rarity of most genes within the enormous microbial pangenomes. Here, we estimated lateral gene transfer rates within the epipelagic tropical and subtropical ocean using a global, randomized collection of single amplified genomes and a non-phylogenetic computational approach. By comparing the fraction of shared genes between pairs of genomes against a lateral gene transfer-free model, we show that an average cell line laterally acquires and retains ~13% of its genes every 1 million years. This translates to a net lateral gene transfer rate of ~250 genes L⁻¹ seawater day⁻¹ and involves both "flexible" and "core" genes. Our study indicates that whereas most genes are exchanged among closely related cells, the range of lateral gene transfer exceeds the contemporary definition of bacterial species, thus providing prokaryoplankton with extensive genetic resources for lateral gene transfer-based adaptation to environmental stressors. This offers an important starting point for the quantitative analysis of lateral gene transfer in natural settings and its incorporation into evolutionary and ecosystem studies and modeling.

RevDate: 2025-09-04

Long GS, Singh N, Patel S, et al (2025)

Integrated genomic approaches improve Treponema pallidum phylogenetics and lineage classification.

Canadian journal of microbiology [Epub ahead of print].

Syphilis cases have been consistently rising since its near elimination in the late 1990s. This resurgence, along with increasing rates of macrolide resistance and congenital syphilis, has triggered renewed efforts to better understand and control the disease. We analyzed 827 T. pallidum genomes and created a new genome-based hierarchical lineage framework, recapitulating the major T. pallidum lineages and characterizing sub-lineages. An updated pangenome was constructed, revealing that T. pallidum subsp. pallidum lineages are determined by a single hypothetical major outer sheath C-terminal domain-containing gene while no significant genetic difference was observed between T. pallidum subsp. pertenue and T. pallidum subsp. endemicum. This study introduces an integrated genomic approach to characterize T. pallidum and highlights the significance of pangenomes in supporting public health.

RevDate: 2025-09-04
CmpDate: 2025-09-04

Li H (2025)

Finding easy regions for short-read variant calling from pangenome data.

GigaScience, 14:.

BACKGROUND: While benchmarks on short-read variant calling suggest a low error rate below 0.5%, they are only applicable to predefined confident regions. For a human sample without such regions, the error rate could be 10 times higher. Although multiple sets of easy regions have been identified to alleviate the issue, they fail to consider nonreference samples or are biased toward existing short-read data or aligners.

RESULTS: Here, using hundreds of high-quality human assemblies, we derived a set of sample-agnostic easy regions where short-read variant calling reaches high accuracy. These regions cover 88.2% of GRCh38, 92.2% of coding regions, and 96.3% of ClinVar pathogenic variants. They achieve a good balance between coverage and easiness and can be generated for other human assemblies or species with multiple well-assembled genomes.

CONCLUSIONS: This resource provides a convenient and powerful way to filter spurious variant calls for clinical or research human samples.

RevDate: 2025-09-04
CmpDate: 2025-09-04

Kupczok A, Gavriilidou A, Paulitz E, et al (2025)

Gene co-occurrence and its association with phage infectivity in bacterial pangenomes.

Philosophical transactions of the Royal Society of London. Series B, Biological sciences, 380(1934):20240070.

Phages infect bacteria and have recently re-emerged as a promising strategy to combat bacterial infections. However, there is a lack of methods to predict whether and why a particular phage can or cannot infect a bacterial strain based on their genome sequences. Understanding the complex interactions between phages and their bacterial hosts is thus of considerable interest. We recently developed Goldfinder, a phylogenetic method to discover gene co-occurrences across bacterial pangenomes. Here, we expand Goldfinder to infer which gene presences or absences influence bacterial sensitivity to phages. By integrating a bacterial pangenome with an experimentally determined host range matrix, we infer associations between phage infectivity and the presence of accessory genes in bacterial pangenomes. The presented approach can be applied to predict bacterial genes that potentially enable phage infection, bacterial genes that prevent phage infection, and potential interactions between particular bacterial and phage accessory genes. Finally, the predicted interactions are clustered and visualized with the software Cytoscape. Here, we present a method to identify candidate genes within the pool of mobile accessory genes that may contribute to phage-host interactions. This approach will help to set up follow-up experiments and to understand the complex interactions between phages and bacteria. This article is part of the discussion meeting issue 'The ecology and evolution of bacterial immune systems'.

RevDate: 2025-09-01

Steensma MJ, Ducro BJ, Dibbits B, et al (2025)

High-quality, haplotype-resolved reference genomes of the Dutch warmblood horse and Friesian horse using trio binning.

BMC genomics, 26(1):790.

BACKGROUND: In horses, genetic diversity is predominantly observed between breeds, with little variation within breeds. The studbooks of the two largest horse populations in the Netherlands, the Dutch Warmblood horse and Friesian horse population, have ongoing conservation projects including collecting large-scale genotype and sequence data. The current reference genome, derived from a Thoroughbred horse, can lead to bias in genetic analyses of other horse breeds. Therefore, the aim of this study was to create high-quality breed-specific reference genomes of Dutch Warmblood and Friesian horses.

RESULTS: We performed nanopore long-read sequencing (R10.4, Q20+) of an F1 cross between a Dutch Warmblood horse and a Friesian horse to create two breed-specific reference genomes by trio binning. This resulted in high-quality, haplotype-resolved reference genomes with contig N50 of 37 and 35 Mb and single copy gene completeness of 99.2 and 99.3% for the Friesian and Warmblood, respectively. The majority of the chromosomes contained telomeric and/or centromeric sequences. The Ensembl gene annotation resulted in 19,750 and 19,872 protein coding genes for the Friesian and Warmblood, respectively. No large chromosomal rearrangements were observed between the Friesian and Warmblood genomes. However, a total of 722 large structural variations (> 10 kb) were identified, of which 14 affect the coding sequence of protein-coding genes.

CONCLUSION: The novel breed-specific reference genomes provide a valuable resource for future genetic analysis and breed conservation efforts and will contribute to ongoing equine pangenome efforts.

SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12864-025-11985-0.

RevDate: 2025-09-02

Kolesch F, Sohn M, Rempel A, et al (2025)

SANS ambages: phylogenomics with abundance-filter, multi-threading, and bootstrapping on amino-acid or genomic sequences.

BMC bioinformatics, 26(1):227.

BACKGROUND: The increasing amount of available genome sequence data enables large-scale comparative studies. A common task is the inference of phylogenies, a challenging task if close reference sequences are not available, genome sequences are incompletely assembled, or the high number of genomes precludes multiple sequence alignment in reasonable time. SANS is an alignment-free, whole-genome based approach for phylogeny estimation.

RESULTS: Here we present a new implementation SANS ambages with a significantly increased application spectrum. It offers additional types of input data, parallelized processing, and bootstrapping. The source code (C++), documentation, and example data are freely available for download at: https://github.com/gi-bielefeld/sans . SANS can also be launched via the web-interface of the CloWM platform, free of charge, with a standard Life Science account: https://clowm.bi.denbi.de/workflows/0194b78f-9696-7402-a2b8-858508733618/ .

CONCLUSIONS: The new version not only shortens processing time on large datasets immensely through parallelization. The ability to process amino acid sequences and the filter for low-abundance DNA read segments also enable new application cases. Bootstrapping and integrated visualization ease and enrich the interpretation of the resulting phylogenies.

RevDate: 2025-09-02

Du M, Zhang F, Wang X, et al (2025)

Structural and deleterious burdens and their effects on yield traits in foxtail millet domestication.

iScience, 28(9):113295 pii:S2589-0042(25)01556-1.

Crop domestication typically accumulates structural and deleterious variants through genetic bottlenecks and selection hitchhiking. However, the structural and deleterious variant burden has not been investigated in the foxtail millet (Setaria italica). Integrating comparative genomics, pangenomics, population genetics, and quantitative genetics, we identified 6,713 gene gains and 2,802 losses during domestication, affecting flowering time and developmental processes. Population genetics of 333 wild and cultivated accessions revealed 25.76% and 40.40% reductions in structural and deleterious variant burdens in cultivars, potentially reflecting a dramatic loss of genetic diversity of the wild progenitor. Quantitative genetics detected genetic association of yield traits, and essential roles of deleterious and structural variants in the formation of yield traits. In general, this study highlights significant impacts of structural and deleterious variants on yield traits and provides valuable guidelines for molecular breeding of foxtail millet.

RevDate: 2025-09-02

Samano A, Musat M, Junaghare M, et al (2025)

Structural variants are enriched in deleterious visible phenotypes in Drosophila.

bioRxiv : the preprint server for biology pii:2025.08.15.670616.

Genome structural variants (SVs) comprise a sizable portion of functionally important genetic variation in all organisms; yet, many SVs evade discovery using short reads. While long-read sequencing can find the hidden SVs, the role of SVs in variation in organismal traits remains largely unclear. To address this gap, we investigate the molecular basis of 50 classical phenotypes in 11 Drosophila melanogaster strains using highly contiguous de novo genome assemblies generated with Oxford Nanopore long reads. These assemblies enabled the creation of a pangenome graph containing comprehensive, nucleotide-resolution maps of SVs, including complex rearrangements such as the interchromosomal inverted duplication Dp(2;4)eyD and large tandem duplications at the Bar locus. We uncovered new candidate causal mutations for 15 phenotypes and new molecular alleles for 2 mutations comprising tandem duplications, transposable element (TE) insertions, and indels. For example, we mapped the tarsal joint defect Ablp[eyD] to an 8 kb Roo retrotransposon insertion into an intergenic enhancer, a finding validated via CRISPR-Cas9. The wing vein phenotype plexus (px[1]) was linked to a 1.5 kb partial tandem gene duplication, and the century-old Curved (c[1]) wing phenotype was linked to a 7.5 kb DM412 retrotransposon inserted into the coding sequence of the muscle protein gene Strn-Mlck. We also unveiled 8 SV alleles of previously identified causal genes, including previously uncharacterized SVs underlying the extensively studied white and yellow phenotypes. Overall, 67.4% of the genes causing phenotypic changes harbored candidate SVs over 100 bp, whereas only 28% is expected based on euchromatic SVs. Our data, based on the 50 Drosophila phenotypes, 44 of which are strongly deleterious, suggests a disproportionately larger contribution of SVs to deleterious changes in visible phenotypes in Drosophila.

RevDate: 2025-09-02

De Santiago A, Barnes S, Pereira TJ, et al (2025)

Pseudoalteromonas is a novel symbiont of marine invertebrates that exhibits broad patterns of phylosymbiosis.

bioRxiv : the preprint server for biology pii:2025.08.22.671635.

Despite growing insights into the composition of marine invertebrate microbiomes, our understanding of their ecological and evolutionary patterns remains poor, owing to limited sampling depth and low-resolution datasets. Previous studies have provided mixed results when evaluating patterns of phylosymbiosis between marine invertebrates and marine bacteria. Here, we investigated potential animal-microbe symbioses in Pseudoalteromonas, an overlooked bacterial genus consistently identified as a core microbiome taxon in diverse invertebrates. Using a pangenomic analysis of 236 free-living and invertebrate-associated bacterial strains (including two new nematode-associated isolates generated in this study), we confirm that Pseudoalteromonas is a novel symbiont with substantial evidence of phylosymbiosis across at least three marine invertebrate phyla (e.g., Nematoda, Mollusca, and Cnidaria). Patterns of symbiosis were consistent irrespective of geography (including in Antarctica), with FISH images from nematodes indicating that bacterial symbionts form biofilms in the mouth and esophagus. The evolutionary history of Pseudoalteromonas is marked by substantial host-switching and lifestyle transitions, and host-associated genomes suggest that these bacteria are facultative symbionts involved in nutritional mutualisms. In marine environments, we hypothesize that horizontally-acquired symbionts may have co-evolved with invertebrates, using host mucus as a physical niche and food source, while providing their animal hosts with Vitamin B, amino acids, and bioavailable carbon compounds in return.

RevDate: 2025-09-01
CmpDate: 2025-09-02

Laidoudi Y, Davoust B, Lepidi H, et al (2025)

Emergence of the zoonotic bacterium Necropsobacter rosorum in nutria Myocastor coypus with implications for wildlife and human health.

Scientific reports, 15(1):32252.

The nutria (Myocastor coypus), a semi-aquatic rodent native to South America, poses significant ecological and agricultural threats as an invasive species in France, where it continues to proliferate despite sustained control efforts. A fatal case of pneumonia in a nutria from Marseille (France) prompted a microbiological investigation that led to the isolation, taxonomic classification, genomic characterization, and phylogenetic analysis of Necropsobacter rosorum. Whole-genome sequencing of the N. rosorum strain RG01 revealed a genome size of 2,505,657 base pairs and 2303 predicted open reading frames, showing high similarity to other publicly available N. rosorum genomes. Comparative pan-genomic analysis indicated a high level of genomic conservation among N. rosorum strains. The presence of putative virulence factors and a CRISPR-Cas system suggests both pathogenic potential and adaptive defense mechanisms against bacteriophage predation. This study also explored the genetic epidemiology of members of the Pasteurellaceae family, highlighting a considerable overlap between species infecting animals and humans. Among the 408,387 sequence records retrieved from GenBank, 62.1% were deemed suitable for genomic epidemiological analysis. Notably, N. rosorum was underrepresented, with only 13 entries spanning nine countries and three host types, revealing critical gaps in current surveillance and research. Collectively, these findings contribute to a better understanding of the microbiology and epidemiology of N. rosorum and Pasteurellaceae-associated infections, and underscore the importance of integrated, genomics-informed approaches for the monitoring, control, and prevention of zoonotic diseases.

RevDate: 2025-09-01
CmpDate: 2025-09-01

Vuong TD, He G, Hu H, et al (2025)

Identification of new genomic loci for seed protein and oil content in the soybean pangenome using genome-wide association and haplotype analyses.

TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik, 138(9):237.

The soybean [Glycine max (L.) Merr.] pangenome has been studied and shown to be an invaluable resource for investigating structural variations (SVs), from which different genomic markers were successfully developed and employed for genome-wide association studies (GWAS). Among the SV markers, gene presence-and-absence variations (PAVs) have been developed in soybean, but have not been widely utilized for association analyses. Here, we report GWAS and haplotype analysis of seed protein and oil content for two diverse panels, comprising over 500 soybean accessions evaluated in multiple field environments using three marker datasets: whole genome sequence (WGS)-single-nucleotide polymorphisms (SNPs), 50 K-SNPs, and PAVs. The analyses identified new quantitative trait loci (QTL) for protein and oil content, along with the validation of previously reported QTL for these traits. This includes a well-studied QTL on chromosome (Chr.) 20 and another one on Chr. 05 for protein and/or oil. Importantly, this study is the first to report a new genomic locus for both protein and oil mapped to Chr. 08. Gene ontology annotations and expression profiles suggested candidate genes. Further analyses using haplotype-based markers led to the identification of multiple haplotype blocks encompassing candidate genes. Among these, Glyma.05G243400 on Chr. 05 and Glyma.08G109900 and Glyma.08G110000 on Chr. 08 were identified as promising targets. These genes can be incorporated into soybean breeding programs to enhance the selection of desirable protein and oil phenotypes through a haplotype-based breeding approach.

RevDate: 2025-08-30

Grujcic V, Mehrshad M, Vigil-Stenman T, et al (2025)

Stepwise genome evolution from a facultative symbiont to an endosymbiont in the N2-fixing diatom-Richelia symbioses.

Current biology : CB pii:S0960-9822(25)01034-6 [Epub ahead of print].

A few genera of diatoms that form stable partnerships with N2-fixing filamentous cyanobacteria Richelia spp. are widespread in the open ocean. A unique feature of the diatom-Richelia symbioses is that the symbiont cellular location spans a continuum of integration (epibiont, periplasmic, and endobiont) that is reflected in the symbiont genome size and content. In this study, we analyzed genomes derived from cultures and environmental metagenome-assembled genomes of Richelia symbionts, focusing on characters indicative of genome evolution. Our results show an enrichment of short-length transposases and pseudogenes in the periplasmic symbiont genomes, suggesting an active and transitionary period in genome evolution. By contrast, genomes of endobionts exhibited fewer transposases and pseudogenes, reflecting advanced stages of genome reduction. Pangenome analyses identified that endobionts streamline their genomes and retain most genes in the core genome, whereas periplasmic symbionts and epibionts maintain larger flexible genomes, indicating higher genomic plasticity compared with the genomes of endobionts. Functional gene comparisons with other N2-fixing cyanobacteria revealed that Richelia endobionts have similar patterns of metabolic loss but are distinguished by the absence of specific pathways (e.g., cytochrome bd ubiquinol oxidase and lipid A) that increase both dependency and direct interactions with their respective hosts. In conclusion, our findings underscore the dynamic nature of genome reduction in N2-fixing cyanobacterial symbionts and demonstrate the diatom-Richelia symbioses as a valuable and rare model to study genome evolution in the transitional stages from a free-living facultative symbiont to a host-dependent endobiont.

RevDate: 2025-08-30
CmpDate: 2025-08-30

Arshad F, Jayaraman S, Talenti A, et al (2025)

A comprehensive water buffalo pangenome reveals extensive structural variation linked to population-specific signatures of selection.

GigaScience, 14:.

BACKGROUND: Water buffalo is a cornerstone livestock species in many low- and middle-income countries, yet major gaps persist in its genomic characterization, complicated by the divergent karyotypes of its two subspecies (swamp and river). Such genomic complexity makes water buffalo a particularly good candidate for the use of graph genomics, which can capture variation missed by linear reference approaches. However, the utility of this approach to improve water buffalo has been largely unexplored.

RESULTS: We present a comprehensive pangenome that integrates 4 newly generated, highly contiguous assemblies of Pakistani river buffalo with 8 publicly available assemblies from both subspecies. This doubles the number of accessible high-quality river buffalo genomes and provides the most contiguous assemblies for the subspecies to date. Using the pangenome to assay variation across 711 global samples, we uncovered extensive genomic diversity, including thousands of large structural variants absent from the reference genome, spanning over 140 Mb of additional sequence. We demonstrate the utility of these data by identifying putative functional indels and structural variants linked to selective sweeps in key genes involved in productivity and immune response across 26 populations.

CONCLUSIONS: This study represents one of the first successful applications of graph genomics in water buffalo and offers valuable insights into how integrating assemblies can transform analyses of water buffalo and other species with complex evolutionary histories. We anticipate that these assemblies, as well as the pangenome and putative functional structural variants we have released, will accelerate efforts to unlock water buffalo's genetic potential, improving productivity and resilience in this economically important species.

RevDate: 2025-08-30
CmpDate: 2025-08-30

Lindstrand A, J Eisfeldt (2025)

Hybrid Sequencing Characterization of Complex Chromosomal Rearrangements.

Methods in molecular biology (Clifton, N.J.), 2968:151-159.

Complex chromosomal rearrangements (CCRs), defined as structural variants involving more than two chromosomes or multiple breakpoint junctions, are challenging to resolve, and causal mutations often go unnoticed in genome studies. Short-read whole-genome sequencing enables the characterization of rearrangement junctions in unique sequences. However, issues persist within repetitive regions of the genome, which are prone to rearrangements. Therefore, complementary genome sequencing technologies may be required to solve the structures of CCRs. Hybrid sequencing, which combines multiple genome sequencing datasets from the same individual, results in a more complete representation of the genome. This approach enhances the ability to resolve rearrangement structures and map breakpoint junctions more accurately.

RevDate: 2025-08-30

Lian Q, Jiao WB, Y Wang (2025)

Designing Better Crops with Phased Pangenomes.

Molecular plant pii:S1674-2052(25)00299-0 [Epub ahead of print].

RevDate: 2025-08-29

Li W, Liang H, Sun J, et al (2025)

A Near Telomere-To-Telomere Genome Assembly and Graph-Based Pangenome of Tartary Buckwheat (Fagopyrum tataricum).

Plant biotechnology journal [Epub ahead of print].

RevDate: 2025-08-28

Cheng L, Bao Z, Kong Q, et al (2025)

Genome analyses and breeding of polyploid crops.

Nature plants [Epub ahead of print].

Polyploidization is a common and important evolutionary process in the plant kingdom. Compared with diploid plant species, the intricate genome architecture of polyploid plant species presents substantial challenges in applying multi-omics approaches for crop breeding improvement. In this Review, we summarize the current techniques for analysing polyploid genomes, including constructing reference genomes and pan-genomes, and detecting variants. We also assess findings related to polyploid genome architecture, population genetics and breeding programmes, highlighting advanced techniques in the breeding of polyploid crops. Finally, we explore the challenges and demands posed by polyploid genome complexity during analysis with available biotechnological tools. This Review emphasizes the importance of a comprehensive understanding of polyploid genomic features for the further genetic improvement of polyploid crops.

RevDate: 2025-08-28

Guo L, He Z, H Huo (2025)

Panaln: Indexing pangenome for read alignment.

Bioinformatics (Oxford, England) pii:8242760 [Epub ahead of print].

MOTIVATION: Pangenome indexing is a critical supporting technology in biological sequence analysis such as read alignment applications. The need to accurately identify billions of small sequencing fragments carrying sequencing errors and genomic variants drives the development of scalable and efficient pangenome indexing approaches.

RESULTS: We propose a new wavelet tree-based approach, called Panaln, for indexing pangenomes and introduce a batch computation approach for fast count queries over Panaln. We present a simple and effective seeding strategy and develop a pangenome program that uses the seed-and-extend paradigm for read alignment. Experimental results on simulated and real data demonstrate that Panaln uses significantly less space than the compared pangenome methods, with generally higher accuracy. We provide a scalable index construction by representing the pangenome with a linear model. Additionally, Panaln brings enhanced accuracy compared to the popular single reference methods.

Package: https://anaconda.org/bioconda/panaln and source code: https://github.com/Lilu-guo/Panaln.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

RevDate: 2025-08-28

Skarlatoudi T, Anagnostou GM, Theodorakis V, et al (2025)

Escherichia coli Strains Originating from Raw Sheep Milk, with Special Reference to Their Genomic Characterization, Such as Virulence Factors (VFs) and Antimicrobial Resistance (AMR) Genes, Using Whole-Genome Sequencing (WGS).

Veterinary sciences, 12(8): pii:vetsci12080744.

The objective of this work was to deliver a comprehensive genetic characterization of a collection of E. coli strains isolated from raw sheep milk. To this end, whole-genome sequencing was performed, coupled with bioinformatics and phenotypic characterization of antimicrobial resistance. These Gram-negative, facultative anaerobic bacteria belong to the family Enterobacteriaceae, together with other intestinal pathogens, such as Shigella spp. and Salmonella spp. Genetic analysis was carried out on all strains (phylogram, sequence types, VFs, AMR genes, and pangenome). The results showed the presence of various genetic traits that are related to virulence factors contributing to their pathogenic potential. In addition, genes conferring resistance to antibiotics were also detected and confirmed using phenotypic tests. Finally, the genome of the E. coli strains was characterized by the presence of several mobile genetic elements, thus facilitating the exchange of various genetic elements, associated with virulence and antimicrobial resistance, within and beyond the species, through horizontal gene transfer. Raw sheep milk contaminated with pathogenic E. coli strains is particularly alarming for cheese production in artisan dairies.

RevDate: 2025-08-28
CmpDate: 2025-08-28

Campos-Godínez JF, Villegas-Campos M, JA Molina-Mora (2025)

Core Perturbomes of Escherichia coli and Staphylococcus aureus Using a Machine Learning Approach.

Pathogens (Basel, Switzerland), 14(8): pii:pathogens14080788.

The core perturbome is defined as a central response to multiple disturbances, functioning as a complex molecular network to overcome the disruption of homeostasis, thereby promoting tolerance and survival under stress conditions. Based on the biological and clinical relevance of Escherichia coli and Staphylococcus aureus, we characterized their molecular responses to multiple perturbations. Gene expression data from E. coli (8815 target genes, based on a pangenome, across 132 samples) and S. aureus (3312 target genes across 156 samples) were used. Accordingly, this study aimed to identify and describe the functionality of the core perturbome of these two prokaryotic models using a machine learning approach. For this purpose, feature selection and classification algorithms (KNN, RF and SVM) were implemented to identify a subset of genes as core molecular signatures, distinguishing control and perturbation conditions. After verifying effective dimensional reduction (with median accuracies of 82.6% and 85.1% for E. coli and S. aureus, respectively), a model of molecular interactions and functional enrichment analyses were performed to characterize the selected genes. The core perturbome was composed of 55 genes (including nine hubs) for E. coli and 46 (eight hubs) for S. aureus. Well-defined interactomes were predicted for each model, which are jointly associated with enriched pathways, including energy and macromolecule metabolism, DNA/RNA and protein synthesis and degradation, transcription regulation, virulence factors, and other signaling processes. Taken together, these results may support the identification of potential therapeutic targets and biomarkers of stress responses in future studies.

RevDate: 2025-08-28

Han X, Qiu C, Gai Z, et al (2025)

Pan-Genome-Based Characterization of the PYL Transcription Factor Family in Populus.

Plants (Basel, Switzerland), 14(16): pii:plants14162541.

Abscisic acid (ABA) is a key phytohormone involved in regulating plant growth and responses to environmental stress. As receptors of ABA, pyrabactin resistance 1 (PYR)/PYR1-like (PYL) proteins play a central role in initiating ABA signal transduction. In this study, a total of 30 PopPYL genes were identified and classified into three subfamilies (PYL I-III) in the pan-genome of 17 Populus species, through phylogenetic analysis. Among these subfamilies, the PYL I subfamily was the largest, comprising 21 members, whereas PYL III was the smallest, with only four members. To elucidate the evolutionary dynamics of these genes, we conducted synteny and Ka/Ks analyses. Results indicated that most PopPYL genes had undergone purifying selection (Ka/Ks < 1), while a few were subject to positive selection (Ka/Ks > 1). Promoter analysis revealed 258 cis-regulatory elements in the PYL genes of Populus euphratica (EUP) and Populus pruinosa (PRU), including 127 elements responsive to abiotic stress and 33 ABA-related elements. Furthermore, six structural variations (SVs) were detected in PYL_EUP genes and significantly influenced gene expression levels (p < 0.05). To further explore the functional roles of PYL genes, we analyzed tissue-specific expression profiles of 17 PYL_EUP genes under drought stress conditions. PYL6_EUP was predominantly expressed in roots, PYL17_EUP exhibited leaf-specific expression, and PYL1_EUP showed elevated expression in stems. These findings suggest that the drought response of PYL_EUP genes is tissue-specific. Overall, this study highlights the utility of pan-genomics in elucidating gene family evolution and suggests that PYL_EUP genes contribute to the regulation of drought stress responses in EUP, offering valuable genetic resources for functional characterization of PYL genes.
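
Editorial illustration (not part of the cited abstract): the Ka/Ks thresholds quoted above can be sketched in a few lines of Python. This is a minimal sketch, assuming Ka and Ks have already been estimated by some other tool; the function name classify_selection and the example values are hypothetical and are not taken from the study.

```python
def classify_selection(ka, ks):
    """Label a gene by its Ka/Ks ratio using the cutoffs quoted in the abstract."""
    if ks <= 0:
        # Ka/Ks is undefined when no synonymous substitutions were estimated.
        return "undefined"
    ratio = ka / ks
    if ratio < 1:
        return "purifying"   # Ka/Ks < 1
    if ratio > 1:
        return "positive"    # Ka/Ks > 1
    return "neutral"         # Ka/Ks == 1


# Hypothetical (Ka, Ks) pairs for illustration only; not data from the study.
examples = {"geneA": (0.02, 0.10), "geneB": (0.30, 0.15), "geneC": (0.05, 0.0)}
labels = {name: classify_selection(ka, ks) for name, (ka, ks) in examples.items()}
print(labels)
```

In practice a ratio close to 1 is usually treated with caution rather than read as strict neutrality; the hard cutoffs here simply mirror the wording of the abstract.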

RevDate: 2025-08-28

Goche T, Mavindidze P, T Zenda (2025)

Advances in Functional Genomics for Exploring Abiotic Stress Tolerance Mechanisms in Cereals.

Plants (Basel, Switzerland), 14(16): pii:plants14162459.

Climate change, population growth and the increasing demand for food and nutritional security necessitate the development of climate-resilient cereal crops. This requires first gaining mechanistic insights into the molecular mechanisms underpinning plant abiotic and biotic stress tolerance. Although this is challenging, recent conceptual and technological advances in functional genomics, coupled with computational biology, high-throughput plant phenotyping and artificial intelligence, are now aiding our uncovering of the molecular mechanisms underlying plant stress tolerance. Integrating other innovative approaches such as genome editing, modern plant breeding and synthetic biology facilitates the development of climate-smart cereal crops. Here, we discuss major recent advances in plant functional genomic approaches and techniques such as third-generation sequencing, transcriptomics, pangenomes, genome-wide association studies and epigenomics, which have advanced our understanding of the molecular basis of stress tolerance and development of stress-resilient cereals. Further, we highlight how these genomics approaches are successfully integrated into new plant breeding methods for effective development of stress-tolerant crops. Overall, harnessing these advances and improved knowledge of crop stress tolerance could accelerate development of climate-resilient cereals for global food and nutrition security.

RevDate: 2025-08-28
CmpDate: 2025-08-28

Nunes NB, Castro VS, da Cunha-Neto A, et al (2025)

Integrated Whole-Genome Sequencing and In Silico Characterization of Salmonella Cerro and Schwarzengrund from Brazil.

Genes, 16(8): pii:genes16080880.

BACKGROUND: Salmonella is a bacterium that causes foodborne infections. This study characterized two strains isolated from cheese and beef in Brazil using whole-genome sequencing (WGS).

OBJECTIVES: We evaluated their antimicrobial resistance profiles, virulence factors, plasmid content, serotypes and phylogenetic relationships.

METHODS: DNA was extracted and sequenced on the NovaSeq 6000 platform; the pangenome was assembled using the Roary tool; and the phylogenetic tree was constructed via IQ-TREE.

RESULTS AND DISCUSSION: For contextualization and comparison, 3493 Salmonella genomes of Brazilian origin from NCBI were analyzed. In our isolates, both strains carried the aac(6')-Iaa_1 gene, while only Schwarzengrund harbored the qnrB19_1 gene and the Col440I_1 plasmid. Cerro presented the islands SPI-1, SPI-2, SPI-3, SPI-4, SPI-5 and SPI-9, while Schwarzengrund also possessed SPI-13 and SPI-14. Upon comparison with other Brazilian genomes, we observed that Cerro and Schwarzengrund represented only 0.40% and 2.03% of the national database, respectively. This comparison also revealed that Schwarzengrund presented higher levels of antimicrobial resistance, a finding supported by the higher frequency of plasmids in this serovar. Furthermore, national data corroborated our findings that SPI-13 and SPI-14 were absent in Cerro. A virulence analysis revealed distinct profiles: the cdtB and pltABC genes were present in the Schwarzengrund isolates, while the sseK and tldE1 family genes were exclusive to Cerro. The results indicated that the sequenced strains have pathogenic potential but exhibit low levels of antimicrobial resistance compared to national data. The greater diversity of SPIs in Schwarzengrund explains their prevalence and higher virulence potential.

CONCLUSIONS: Finally, the serovars exhibit distinct virulence profiles, which results in different clinical outcomes.

RevDate: 2025-08-28

Yinsai O, Yuantrakul S, Srisithan P, et al (2025)

Genomic Insights into Emerging Multidrug-Resistant Chryseobacterium indologenes Strains: First Report from Thailand.

Antibiotics (Basel, Switzerland), 14(8): pii:antibiotics14080746.

Background: Chryseobacterium indologenes, an environmental bacterium, is increasingly recognized as an emerging nosocomial pathogen, particularly in Asia, and is often characterized by multidrug resistance. Objectives: This study aimed to investigate the genomic features of clinical C. indologenes isolates from Maharaj Nakorn Chiang Mai Hospital, Thailand, to understand their mechanisms of multidrug resistance, virulence factors, and mobile genetic elements (MGEs). Methods: Twelve C. indologenes isolates were identified, and their antibiotic susceptibility profiles were determined. Whole genome sequencing (WGS) was performed using a hybrid approach combining Illumina short-reads and Oxford Nanopore long-reads to generate complete bacterial genomes. The hybrid assembled genomes were subsequently analyzed to detect antimicrobial resistance (AMR) genes, virulence factors, and MGEs. Results: C. indologenes isolates were primarily recovered from urine samples of hospitalized elderly male patients with underlying conditions. These isolates generally exhibited extensive drug resistance, which was subsequently explored and correlated with genomic determinants. The one exception, CMCI13, showed a lower resistance profile (multidrug resistance, MDR). Genomic analysis revealed isolates with genome sizes of 4.83-5.00 Mb and GC content of 37.15-37.35%. Genomic characterization identified conserved resistance genes (blaIND-2, blaCIA-4, adeF, vanT, and qacG) and various virulence factors. Phylogenetic and pangenome analysis showed 11 isolates clustering closely with Chinese strain 3125, while one isolate (CMCI13) formed a distinct branch. Importantly, each isolate, except CMCI13, harbored a large genomic island (approximately 94-100 kb) carrying significant resistance genes (blaOXA-347, tetX, aadS, and ermF). The absence of this genomic island in CMCI13 correlated with its less resistant phenotype. No plasmids, integrons, or CRISPR-Cas systems were detected in any isolate. Conclusions: This study highlights the alarming emergence of multidrug-resistant C. indologenes in a hospital setting in Thailand. The genomic insights into specific resistance mechanisms, virulence factors, and potential horizontal gene transfer (HGT) events, particularly the association of a large genomic island with the XDR phenotype, underscore the critical need for continuous genomic surveillance to monitor transmission patterns and develop effective treatment strategies for this emerging pathogen.

RevDate: 2025-08-27
CmpDate: 2025-08-27

Pozzi CM, Gaiti A, A Spada (2025)

Climate change and plant genomic plasticity.

TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik, 138(9):231.

Genome adaptation, driven by mutations, transposable elements, and structural variations, relies on plasticity and instability. This allows populations to evolve, enhance fitness, and adapt to challenges like climate change. Genomes adapt via mutations, transposable elements, DNA structural changes, and epigenetics. Genome plasticity enhances fitness by providing the genetic variation necessary for organisms to adapt their traits and survive, which is especially critical during rapid climate shifts. This plasticity often stems from genome instability, which facilitates significant genomic alterations like duplications or deletions. While potentially harmful initially, these changes increase genetic diversity, aiding adaptation. Major genome reorganizations arise from polyploidization and horizontal gene transfer, both linked to instability. Plasticity and restructuring can modify Quantitative Trait Loci (QTLs), contributing to adaptation. Tools like landscape genomics identify climate-selected regions, resurrection ecology reveals past adaptive responses, and pangenome analysis examines a species' complete gene set. Signatures of past selection include reduced diversity and allele frequency shifts. Gene expression plasticity allows environmental adaptation without genetic change through mechanisms like alternative splicing, tailoring protein function. Co-opted transposable elements also generate genetic and regulatory diversity, contributing to genome evolution. This review consolidates these findings, repositioning genome instability not as a mere source of random error but as a fundamental evolutionary engine that provides the rapid adaptive potential required for plant survival in the face of accelerating climate change.

RevDate: 2025-08-27

Popov IV, Todorov SD, Chikindas ML, et al (2025)

Beyond White-Nose Syndrome: Mitochondrial and Functional Genomics of Pseudogymnoascus destructans.

Journal of fungi (Basel, Switzerland), 11(8):.

White-Nose Syndrome (WNS) has devastated insectivorous bat populations, particularly in North America, leading to severe ecological and economic consequences. Despite extensive research, many aspects of the evolutionary history, mitochondrial genome organization, and metabolic adaptations of its etiological agent, Pseudogymnoascus destructans, remain unexplored. Here, we present a multi-scale genomic analysis integrating pangenome reconstruction, phylogenetic inference, Bayesian divergence dating, comparative mitochondrial genomics, and refined functional annotation. Our divergence dating analysis reveals that P. destructans separated from its Antarctic relatives approximately 141 million years ago, before adapting to bat hibernacula in the Northern Hemisphere. Additionally, our refined functional annotation significantly expands the known functional landscape of P. destructans, revealing an extensive repertoire of previously uncharacterized proteins involved in carbohydrate metabolism and secondary metabolite biosynthesis, key processes that likely contribute to its pathogenic success. By providing new insights into the genomic basis of P. destructans adaptation and pathogenicity, our study refines the evolutionary framework of this fungal pathogen and creates the foundation for future research on WNS mitigation strategies.

RevDate: 2025-08-27
CmpDate: 2025-08-27

Zhu Z, N Stein (2025)

Pangenome insights into structural variation and functional diversification of barley CCT motif genes.

The plant genome, 18(3):e70098.

CONSTANS, CONSTANS-LIKE, TIMING OF CAB EXPRESSION1 (CCT) motif genes play a key role in barley (Hordeum vulgare L.) development and flowering, yet their genetic diversity remains underexplored. Leveraging a barley pangenome (76 genotypes) and pan-transcriptome (subset of 20 genotypes), we examined CCT gene variation and evolutionary dynamics. Motif-based searches, combined with genome assembly validation, revealed annotation limitations and novel frameshift variants (e.g., HvCO10, where Hv is Hordeum vulgare L.), indicating active diversification. Pangenome-wide phylogenetic analysis identified clade-specific domain expansions, including B-box domain additions in HvCO clades. Tissue-specific expression patterns further supported functional divergence among paralogs. Notably, VRN2, a canonical floral repressor associated with winter growth, was retained in spring genotypes, challenging its presumed exclusive role in vernalization. Discrepancies between VRN1 expression, VRN2 deletion, and growth habit implicated additional regulatory mechanisms. These findings highlight the power of pangenomes in resolving gene family complexity, refining annotations, and advancing the understanding of CCT genes to enhance barley resilience and adaptability.

RevDate: 2025-08-26

Kileeg Z, GA Mott (2025)

A species-wide inventory of receptor-like kinases in Arabidopsis thaliana.

BMC biology, 23(1):266 pii:10.1186/s12915-025-02364-y.

BACKGROUND: The receptor-like kinases (RLKs) are the largest family of proteins in plants. Characterized members play critical roles in diverse processes from growth to immunity, and yet the majority do not have a known function. Assigning function to RLKs poses a significant challenge due to the specificity of ligand recognition and because of the often pleiotropic or redundant functions RLKs possess. These problems inhibit the important work of identifying stress-related receptors that may be targets for crop improvement. Identification of stress-related evolutionary signatures can provide a way to expedite the discovery of candidate receptors. Pan-genome analysis can be used to compare naturally occurring variants within a species to identify evolutionary signatures that may otherwise be hidden by using only a single ecotype.

RESULTS: Using 146 ecotypes of Arabidopsis, we generated a pan-RLKome to investigate species-wide natural diversity and identify structural variation and other patterns indicative of stress adaptation. We discovered significant presence/absence variation across a subset of RLKs, most of which occurred in specific subclades nested within receptor subfamilies. These same subclades tended to have arisen through proximal or tandem duplication, both of which are common mechanisms during the expansion of stress-related genes. We also identified strong positive selection across many gene subfamilies and a bias of positive selection in the extracellular domains of receptors. This suggests escape from adaptive conflict within the extracellular domain may have played a large role in the evolution and adaptation of the RLKs.

CONCLUSION: Taken together, this work represents an excellent tool for the comparative study of RLKs and has identified lineages and subclades within RLK subfamilies with the hallmarks of involvement in stress adaptation.

RevDate: 2025-08-26
CmpDate: 2025-08-26

Maurizi L, Musleh L, Brunetti F, et al (2025)

Uropathogenic Escherichia coli (UPEC) that hides its identity: features of LC2 and EC73 strains from recurrent urinary tract infections.

BMC microbiology, 25(1):547.

BACKGROUND: Uropathogenic Escherichia coli (UPEC) strains are the major causative agents of human urinary tract infections (UTIs). Many patients who develop UTIs will experience a recurrent UTI (RUTI) within 6 months despite antibiotic-mediated clearance of the initial infection. A significant proportion of RUTIs are caused by E. coli identical to the original strain. UPEC employs several strategies to adhere, colonize, and persist within the bladder niche. Knowledge about the mechanisms regulating specific host-pathogen interactions that promote bacterial persistence is necessary to develop new approaches to RUTI diagnosis and treatment.

RESULTS: LC2 and EC73 UPEC strains were collected from patients with RUTIs. E. coli CFT073 and K-12 MG1655 were used as reference strains. UPEC displayed phenotypic profiles like those of the general E. coli population. The pan-genome analysis revealed that LC2 harbored many unique genes encoding several different functions such as intracellular trafficking and secretion, and vesicular transport. In contrast, EC73 was the strain with the lowest number of unique genes involved in replication, recombination, repair and cell wall/membrane/envelope biogenesis. LC2 and EC73 exhibited the capacity to invade bladder monolayers efficiently and to colonize the gut of Caenorhabditis elegans, with LC2 being significantly more virulent than EC73. T24 cells infected with EC73 and LC2 strains exhibited significantly increased mRNA levels of IL-6, IL-8, IL-1β and TNF-α. EC73 elicited the strongest cytokine response. By contrast, no significant cytokine mRNA induction was detected in T24 cells infected with E. coli CFT073. LC2 and EC73 modulated the expression of proteins involved in reactive oxygen species (ROS) balance in infected cells, but to different extents.

CONCLUSION: The acquisition of virulence factors by horizontal transfer of accessory DNA, besides driving the transformation into pathogenic strains, is responsible for genomic plasticity. Our findings suggest that a key role in RUTIs could be played by certain bacterial strains that may benefit from peculiar abilities to adapt and potentially develop reservoirs of persistence across different host environments.

RevDate: 2025-08-25

Chaity SC, Hosen MA, Rahman SR, et al (2025)

Genomic characterization and comparative analysis of antibiotic resistance and virulence in Bangladeshi and global Klebsiella pneumoniae ST48 strains.

Journal, genetic engineering & biotechnology, 23(3):100557.

Klebsiella pneumoniae is an opportunistic pathogen associated with nosocomial infections, known for its multidrug resistance (MDR) and biofilm-forming abilities. ST48 is a particularly concerning sequence type and an emerging international clone linked to global spread and MDR infections. This study examines the comprehensive genomic epidemiology of the local and global populations of K. pneumoniae ST48 strains using whole-genome sequence data. We performed phenotypic and genotypic characterization of a K. pneumoniae strain S3C and conducted molecular epidemiological analyses of local ST48 isolates in Bangladesh, followed by pan-genome and phylogenetic analyses of 397 global ST48 strains. The S3C strain was resistant to 17 out of 19 tested antibiotics and was a moderate biofilm former. Whole genome sequencing identified it as ST48 clonal type, with 13 acquired antibiotic resistance genes, 76 virulence-associated genes, and multiple mobile genetic elements. Comparative analysis of Bangladeshi ST48 strains indicated a high prevalence of MDR genes, particularly blaCTX-M-15, and a diverse array of virulence factors associated with biofilm formation, siderophore production, capsular biosynthesis and others. Pan-genome analysis of Bangladeshi ST48 strains revealed 8,030 genes, with 56.26% classified as core genes. In contrast, global ST48 strains had 16,307 genes, with 75.3% as accessory genes, highlighting extensive genomic plasticity. The phylogenetic analysis revealed that isolates from different regions clustered within the major clade, indicating the global dissemination of this sequence type. Our findings underscore the substantial genomic diversity and high resistance levels of K. pneumoniae ST48, emphasizing the need for targeted infection control measures and continuous surveillance.

RevDate: 2025-08-25

Shahed K, Chakma A, Manjur OHB, et al (2025)

Multiscale comparative pathogenomic analysis of Vibrio anguillarum linking serotype diversity, genomic plasticity and pathogenicity.

Journal, genetic engineering & biotechnology, 23(3):100522.

Vibrio anguillarum is a major marine fish pathogen causing high mortality and potential zoonotic risks. Understanding its genomic diversity, virulence factors, and antibiotic resistance is crucial for aquaculture disease management. In this study, a comparative pan-genomic analysis of 16 V. anguillarum strains was conducted to examine core and accessory genome diversity, virulence factors, and antibiotic resistance mechanisms. The phylogenetic analysis was conducted using six core genes and SNPs to evaluate evolutionary relationships and pathogenic traits. The core genome contained 2,038 unique ORFs, while the accessory genome had 5,197 cloud genes, confirming an open pangenome. This study identified 118 pathogenic genomic islands, antibiotic resistance genes (tetracycline, quinolone, and carbapenem), and virulence factors, including type VI secretion system (T6SS) components and RTX toxins (hcp-2, vipB/mglB, rtxC). Core genes such as ftsI uncovered substantial evolutionary divergence among species, identifying more than 150 distinct SNPs. Phylogenetic analysis showed serotype-specific clustering, with O1 strains displaying genetic homogeneity, whereas O2 and O3 exhibited divergence, suggesting distinct evolutionary adaptations influencing pathogenicity and ecological interactions. These findings provide primary insights for developing molecular markers and targeted treatments for aquaculture pathogens.

RevDate: 2025-08-25

Ryan AP, Bergin S, Scully J, et al (2025)

Small pangenome of Candida parapsilosis reflects overall low intraspecific diversity.

mBio [Epub ahead of print].

Candida parapsilosis is an opportunistic yeast pathogen that can cause life-threatening infections in immunocompromised humans. Whole-genome sequencing studies of the species have demonstrated remarkably low diversity, with strains typically differing by about 1.5 single nucleotide polymorphisms (SNPs) per 10 kb. However, SNP calling alone does not capture the full extent of genetic variation. Here, we define the pangenome of 372 C. parapsilosis isolates to determine variation in gene content. The pangenome consists of 5,859 genes, of which 48 are not found in the genome of the reference strain. This includes 5,791 core genes (present in ≥99.5% of isolates). Four genes, including the allantoin permease gene DAL4, were present in all isolates but were truncated in some strains. The truncated DAL4 was classified as a pseudogene in the reference strain CDC317. CRISPR-Cas9 gene editing showed that removing the early stop codon (producing the full-length Dal4 protein) is associated with improved use of allantoin as a sole nitrogen source. We find that the accessory genome of C. parapsilosis consists of 68 homologous clusters. This includes 38 previously annotated genes, 27 novel paralogs of previously annotated genes, and 3 uncharacterized open reading frames. Approximately one-third of the accessory genome (24/68 genes) is associated with gene fusions between tandem genes in the major facilitator superfamily. Additionally, we identified two highly divergent C. parapsilosis strains and found that, despite their increased phylogenetic distance (~30 SNPs per 10 kb), both strains have similar gene content to the other 372.

IMPORTANCE: Candida parapsilosis is a human fungal pathogen listed in the high-priority group by the World Health Organization. It is an increasing cause of hospital-acquired and drug-resistant infections. Here, we studied the genetic diversity of 372 C. parapsilosis isolates, the largest genomic surveillance of this species to date. We show that there is relatively little genetic variation. However, we identified two more distantly related isolates from Germany, suggesting that even more sampling may yield more diversity. We find that the pangenome (the cumulative gene content of all isolates) is surprisingly small, compared to other fungal species. Many of the non-core genes are involved in transport. We also find that variations in gene content are associated with nitrogen metabolism, which may contribute to the virulence characteristics of this species.
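
Editorial illustration (not part of the cited abstract): the core-gene definition quoted above (present in ≥99.5% of isolates) can be sketched as a simple split of a gene presence table. This is a minimal sketch, not the authors' pipeline; the function names, the input layout (gene mapped to the set of isolates carrying it) and the toy data are all assumptions made for illustration. With 372 isolates, the cutoff works out to presence in at least 371 of them.

```python
import math
from fractions import Fraction

# The >=99.5% cutoff quoted in the abstract, kept exact to avoid float rounding.
CORE_THRESHOLD = Fraction(995, 1000)


def min_isolates_for_core(n_isolates, threshold=CORE_THRESHOLD):
    """Smallest number of isolates a gene must occur in to count as core."""
    return math.ceil(threshold * n_isolates)


def split_core_accessory(presence, n_isolates, threshold=CORE_THRESHOLD):
    """Split genes into (core, accessory) lists by how many isolates carry them.

    `presence` maps a gene name to the set of isolates in which it was found.
    """
    cutoff = min_isolates_for_core(n_isolates, threshold)
    core = sorted(g for g, isolates in presence.items() if len(isolates) >= cutoff)
    accessory = sorted(g for g, isolates in presence.items() if len(isolates) < cutoff)
    return core, accessory


# Toy presence table for illustration only; not data from the study.
toy = {
    "gene1": {"s1", "s2", "s3", "s4"},
    "gene2": {"s1", "s2", "s3"},
    "gene3": {"s4"},
}
core, accessory = split_core_accessory(toy, n_isolates=4)
print(core, accessory)
print(min_isolates_for_core(372))
```

Using an exact fraction rather than 0.995 keeps the cutoff from drifting by one isolate when the product happens to be a whole number.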

RevDate: 2025-08-25

Liu X, Zhang M, Su J, et al (2025)

The evolution, variation, and expression patterns under development and stress responses of the NAC gene family in the barley pan-genome.

Frontiers in plant science, 16:1635416.

The NAC transcription factor family is pivotal in regulating plant development and stress responses, yet its diversity and evolutionary dynamics in barley (Hordeum vulgare) remain underexplored. In this study, we performed a comprehensive pan-genome analysis to identify and characterize the HvNACs across 20 barley accessions. Between 127 and 149 HvNACs were identified in each genome, with the Morex genome harboring the highest count. These HvNACs were classified into 201 orthogroups, further stratified into core (102), soft-core (18), shell (25), and lineage-specific (56) categories. Phylogenetic analysis delineated them into 12 subfamilies. The core genes have undergone strong purifying selection, whereas the shell and lineage-specific genes were under relaxed selective constraint, suggesting functional diversification in barley. Genomic variation, such as PAVs and CNVs, largely driven by TEs, highlighted the dynamic nature of NAC loci. Furthermore, transcriptome profiling of the HvNACs demonstrated diverse tissue expression patterns and different response characteristics under salt stress. These findings elucidate the evolutionary and functional dynamics of HvNACs, offering valuable insights for genetic improvement in breeding programs in barley as well as in other crops.

RevDate: 2025-08-23

Jayachandiran S, Suresh R, R Dhamodharan (2025)

Comparative and phylogenomic analysis of Chlamydia pneumoniae reveals unique carbohydrate active enzyme family (GT5) among respiratory isolates.

Infection, genetics and evolution : journal of molecular epidemiology and evolutionary genetics in infectious diseases pii:S1567-1348(25)00102-9 [Epub ahead of print].

Chlamydia pneumoniae is an obligate intracellular pathogen found in humans and animals. Understanding its genomic diversity is crucial for unravelling its pathogenic mechanisms and transmission dynamics. In this study, 14 complete genomes of C. pneumoniae strains were compared for functional diversity analysis. The koala isolate LPCoLN appears phylogenetically distinct, showing the fewest accessory genes and the highest incorporation of unique or absent genes among the strains analyzed. Functional annotation indicates that certain metabolic pathways between the LPCoLN and the human respiratory strain AR39 were the same, which is most likely due to phage-associated elements present in AR39. The presence of the GT5 CAZyme family is significantly associated with strains of respiratory origin, suggesting a potential role in respiratory adaptation and pathogenic strategies including tissue colonization, immune evasion, and niche-specific persistence. The strong association between GT5 CAZymes and respiratory-origin strains highlights their potential as diagnostic markers and therapeutic targets.

RevDate: 2025-08-23

Dishuck PC, Munson KM, Lewis AP, et al (2025)

Structural variation, selection, and diversification of the NPIP gene family from the human pangenome.

Cell genomics pii:S2666-979X(25)00233-2 [Epub ahead of print].

The NPIP gene family is among the most positively selected gene families in humans/apes and drives independent duplication in primate lineages. These duplications promote genetic instability, leading to recurrent disease-associated microduplication and microdeletion syndromes. Despite its importance, little is known about its function or variation in humans, as short-read sequencing cannot distinguish high-identity duplications. Using long-read assemblies of 169 human haplotypes, we find extreme variation in the content and organization of NPIP loci. We identify fixed and polymorphic paralogs and observe ongoing positive selection. With long-read RNA sequencing (RNA-seq), we create paralog-specific gene models, the majority of which were not previously documented, and observe paralog-specific tissue specificity. This analysis of an exceptionally dynamic gene family provides candidates for future functional study.

RevDate: 2025-08-23
CmpDate: 2025-08-23

Fouéré C, Costes V, Hozé C, et al (2025)

Genetic regulation of sperm DNA methylation in cattle through meQTL mapping.

BMC genomics, 26(1):771.

BACKGROUND: DNA methylation (DNAm) plays an important functional role and is influenced by genetic variants known as methylation QTLs (meQTLs). The majority of meQTL studies have been conducted in human blood. Despite its unique landscape, the genetic regulation of sperm DNAm remains largely unexplored. In this study, we leveraged DNAm measured in sperm from 405 Holstein bulls using reduced representation bisulfite sequencing (RRBS) and performed sequence-level genome-wide association studies for 166,985 variable CpGs (s.d. >5%). We reported heritability estimates and have mapped both cis-meQTLs and trans-meQTLs.

RESULTS: Heritability estimates ranged from 0 to 1 and averaged 0.26 across all selected CpGs, with 76% of estimates above 0.1. The meQTL mapping revealed that 32.9% of the CpGs had a cis-meQTL, 3.6% had a trans-meQTL and 1.0% had both cis- and trans-meQTLs. The cis-CpGs were located on average 261 kb (absolute mean) from their cis-meQTL top SNPs (defined by the most significant association). MeQTLs were enriched in featured genomic annotations, including regions surrounding transcription start sites and ATAC-seq peaks. We also identified spurious trans-associations by analyzing data across multiple genome assemblies, including the construction of a partial pangenome. Additionally, eight trans-meQTL hotspots, defined as variants associated with at least 30 trans-CpGs, were identified and overlapped with genes involved in epigenetic regulation. Using peripheral blood mononuclear cell DNAm from 54 out of the 405 bulls, we did not observe an effect of the trans-meQTL hotspots similar to that observed in sperm.

CONCLUSIONS: For the first time, meQTLs have been detected and characterized in bovine sperm, contributing to a better understanding of the transmission of paternally inherited DNAm marks. These findings provide useful information for further research aimed at integrating epigenetic information into the prediction of performance traits.

RevDate: 2025-08-22

Le MH, Proctor M, JP Huang (2025)

Chromosome Level Genome Assembly of Dynastes reidi Reveals Structural Evolution of Autosomes and the Sex Chromosomes in Hercules Beetles.

G3 (Bethesda, Md.) pii:8239991 [Epub ahead of print].

The Hercules beetles have long been iconic symbols of evolutionary diversification, sexual selection, and systematics. Despite their rapid phenotypic evolution and a rich history of inspiring evolutionary biologists, genomic resources for these charismatic beetles remain limited, especially for the Giant Hercules beetles. We present the first chromosome-level genome assembly of a Giant Hercules beetle from the Lesser Antilles. The assembled genome is approximately 837 Mb in size, with a scaffold N50 of 66.68 Mb, which can be anchored to 11 pseudochromosomes with a BUSCO completeness score of 95.9%. An estimate of 55.5% of the genome can be attributed to repetitive elements. Additionally, we detected candidate sex-linked chromosomes by comparing sequencing read depths between one male and two females using Illumina short reads. The chromosome-level genome assembly of Dynastes reidi not only provides critical insights into evolutionary and functional genomics, but also supports informed conservation and management efforts. In addition, this genomic resource will enable future pangenome analyses aimed at understanding the genetic basis of species divergence and morphological innovation in beetles. Our study also marks the emergence of a new model system to investigate the origin and diversification of phenotypic novelty by leveraging genomic resources across diverse domesticated beetle breeds.
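The scaffold N50 quoted in the abstract above is a standard contiguity statistic: the length of the scaffold at which the running total, taken from the longest scaffold downward, first reaches half of the assembly size. A minimal sketch follows; the function name and the toy lengths are illustrative assumptions and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Made-up scaffold lengths (arbitrary units), not data from the assembly.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # 70: 80 + 70 = 150 is half of the total 300
```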

RevDate: 2025-08-22
CmpDate: 2025-08-22

Gonzalez-Reyes M, Ramos-Tapia I, JA Ugalde (2025)

A global perspective on the genomics of Moraxella catarrhalis.

Microbial genomics, 11(8):.

Moraxella catarrhalis is an opportunistic pathogen of the human respiratory tract, primarily associated with otitis media in children and exacerbations of chronic obstructive pulmonary disease in adults. Despite its clinical importance, the genomic diversity and functional specialization of M. catarrhalis remain insufficiently characterized. This study aimed to analyse the global genetic diversity of M. catarrhalis using whole-genome sequencing to identify phylogenetic lineages, antimicrobial resistance patterns and key virulence factors. Phylogenomic analysis of 345 publicly available genomes identified 3 phylogroups, of which 1 exhibited significant genomic divergence and was excluded from further analyses due to its potential classification as a separate species. The remaining two phylogroups corresponded to previously described seroresistant and serosensitive lineages. Phylogroup B exhibited a higher prevalence of antimicrobial resistance genes, particularly bro-1 and bro-2, while phylogroup A exhibited unique metabolic adaptation, including genes encoding for the DppB-DppC-DppD dipeptide transport system. Both phylogroups shared crucial virulence factors, including UspA1 and UspA2, which facilitate adhesion and immune evasion. Potential therapeutic targets were identified, including PilQ, essential for type IV pilus biogenesis, and CopB, which plays a key role in iron acquisition and immune evasion. Overall, these findings highlight the significance of phylogenomics approaches in elucidating the genetic mechanisms underlying pathogenicity and resistance in M. catarrhalis, providing insights for future therapeutic and preventive strategies.

RevDate: 2025-08-21

Chandra G, Hossen MH, Scholz S, et al (2025)

Pangenome-based genome inference using integer programming.

Genome research pii:gr.280567.125 [Epub ahead of print].

Affordable genotyping methods are essential in genomics. Commonly used genotyping methods primarily support single nucleotide variants and short indels but neglect structural variants. Additionally, accuracy of read alignments to a reference genome is unreliable in highly polymorphic and repetitive regions, further impacting genotyping performance. Recent works highlight the advantage of haplotype-resolved pangenome graphs in addressing these challenges. Building on these developments, we propose a rigorous alignment-free genotyping method. Our optimization framework identifies a path through the pangenome graph that maximizes the matches between the path and substrings of sequencing reads (e.g., k-mers) while minimizing recombination events (haplotype switches) along the path. We prove that this problem is NP-Hard and develop efficient integer-programming solutions. We benchmarked the algorithm using downsampled short-read datasets from homozygous human cell lines with coverage ranging from 0.1× to 10×. Our algorithm accurately estimates complete major histocompatibility complex (MHC) haplotype sequences with small edit distances from the ground-truth sequences, providing a significant advantage over existing methods on low-coverage inputs. While this algorithm is designed for haploid genomes, we discuss directions for extending it to diploid genotyping.
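The objective described in the abstract above (reward k-mer matches along a path, penalize haplotype switches) can be illustrated with a much simpler relaxation. The sketch below assumes the graph has already been cut into consecutive blocks and that each haplotype has an independent k-mer hit count per block; under that assumption a Viterbi-style dynamic program suffices. It is not the integer-programming formulation of the paper, which addresses an NP-hard problem, and `best_path`, `kmer_hits`, and `switch_penalty` are hypothetical names.

```python
def best_path(kmer_hits, switch_penalty):
    """Pick one haplotype per block, maximizing hits minus switch penalties.

    kmer_hits[h][i] is the (assumed precomputed) number of read k-mers that
    match haplotype h in block i. Returns (score, path), where path[i] is the
    index of the haplotype chosen for block i.
    """
    n_hap = len(kmer_hits)
    n_blk = len(kmer_hits[0])
    score = [kmer_hits[h][0] for h in range(n_hap)]
    back = []  # back[i-1][h] = haplotype used in block i-1 when block i uses h
    for i in range(1, n_blk):
        best_prev = max(range(n_hap), key=lambda h: score[h])
        new_score, pointers = [], []
        for h in range(n_hap):
            stay = score[h]
            switch = score[best_prev] - switch_penalty
            if stay >= switch:
                new_score.append(stay + kmer_hits[h][i])
                pointers.append(h)
            else:
                new_score.append(switch + kmer_hits[h][i])
                pointers.append(best_prev)
        score = new_score
        back.append(pointers)
    # Trace back from the best final haplotype.
    end = max(range(n_hap), key=lambda h: score[h])
    path = [end]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return score[end], path


# Toy hit counts for two haplotypes over four blocks (invented numbers).
hits = [[5, 5, 0, 0],
        [0, 1, 4, 6]]
print(best_path(hits, switch_penalty=2))    # one switch pays off here
print(best_path(hits, switch_penalty=100))  # switching is too expensive
```

A low penalty lets the path recombine between haplotypes when the reads support it; a high penalty forces the path to follow a single haplotype.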

RevDate: 2025-08-20

Zhang H, Liu N, Wang Y, et al (2025)

Super-pangenome analyses across 35 accessions of 23 Avena species highlight their complex evolutionary history and extensive genomic diversity.

Nature genetics [Epub ahead of print].

Common oat, belonging to the genus Avena with 30 recognized species, is a nutritionally important cereal crop and high-quality forage worldwide. Here, we construct a genus-level super-pangenome of Avena comprising 35 high-quality genomes from 14 cultivated oat accessions and 21 wild species. The fully resolved phylogenomic analysis unveils the origin and evolutionary scenario of Avena species, and the super-pangenome analysis identifies 26.62% and 59.93% specific genes and haplotypes in wild species. We delineate the landscape of structural variations (SVs) and the transcriptome profile based on 1,401 RNA-sequencing (RNA-seq) samples from diverse abiotic stress treatments in oat. We highlight the crucial role of SVs in modulating gene expression and shaping adaptation to diverse stresses. Further combining SV-based genome-wide association studies (GWASs), we characterize 13 candidate genes associated with drought resistance such as AsARF7, validated by transgenic oat lines. Our study provides unprecedented genomic resources to facilitate genomic, evolution and molecular breeding research in oat.

RevDate: 2025-08-20
CmpDate: 2025-08-20

Han C, Lu S, Hu P, et al (2025)

Machine learning based on pangenome-wide association studies reveals the impact of host source on the zoonotic potential of closely related bacterial pathogens.

Communications biology, 8(1):1253.

Variations in host species significantly impact bacterial growth traits and antibiotic resistance, making it essential to consider host origin when evaluating the zoonotic potential of pathogens. This study focuses on multiple Brucella species, which share highly similar genetic material, to explore the relationship between host origin and zoonotic potential by integrating pan-genome-wide association studies (pan-GWAS) with machine learning (ML). Our results present an open pangenome of Brucella spp. derived from the whole-genome sequencing (WGS) data of 991 strains and identify 268 genes potentially associated with the zoonotic potential of Brucella. Integrating these genes into an ML model based on the support vector machine (SVM) algorithm allows us to predict the zoonotic potential of various Brucella strains with high accuracy. Our findings reveal that zoonotic potential varies by host origin: Brucella melitensis strains isolated from humans exhibit higher zoonotic potential than those isolated from cattle, goats, and sheep, while Brucella suis biovar 2 strains isolated from domestic pigs display higher zoonotic potential than those isolated from wild boars. Our study proposes a method for predicting and quantifying the zoonotic potential of closely related bacterial pathogens from different host origins, providing valuable insights for risk assessment and public health strategy.

RevDate: 2025-08-19

Igolkina AA, Vorbrugg S, Rabanal FA, et al (2025)

A comparison of 27 Arabidopsis thaliana genomes and the path toward an unbiased characterization of genetic polymorphism.

Nature genetics [Epub ahead of print].

Making sense of whole-genome polymorphism data is challenging, but it is essential for overcoming the biases in SNP data. Here we analyze 27 genomes of Arabidopsis thaliana to illustrate these issues. Genome size variation is mostly due to tandem repeat regions that are difficult to assemble. However, while the rest of the genome varies little in length, it is full of structural variants, mostly due to transposon insertions. Because of this, the pangenome coordinate system grows rapidly with sample size and ultimately becomes 70% larger than the size of any single genome, even for n = 27. Finally, we show how short-read data are biased by read mapping. SNP calling is biased by the choice of reference genome, and both transcriptome and methylome profiling results are affected by mapping reads to a reference genome rather than to the genome of the assayed individual.

RevDate: 2025-08-19

Sibbald SJ, Lawton M, Maclean C, et al (2025)

Pangenome biology and evolution in harmful algal-bloom-forming pelagophytes.

Current biology : CB pii:S0960-9822(25)00964-9 [Epub ahead of print].

In prokaryotes, lateral gene transfer (LGT) is a key mechanism leading to intraspecies variability in gene content and the phenomenon of pangenomes. In microbial eukaryotes, however, the extent to which LGT-driven pangenomes exist is unclear. Pelagophytes are ecologically important marine algae that include Aureococcus anophagefferens, a species notorious for causing harmful algal blooms. To investigate genome evolution across Pelagophyceae and within Ac. anophagefferens, we used long-read sequencing to produce high-quality genome assemblies for five strains of Ac. anophagefferens (52-54 megabase pairs [Mbp]), a telomere-to-telomere assembly for Pelagomonas calceolata (32 Mbp), and the first reference genome for Aureoumbra lagunensis (41 Mbp). Using comparative genomics and phylogenetics, we show remarkable strain-level genetic variation in Ac. anophagefferens, with a pangenome (23,356 orthogroups) that is 81.1% core and 18.9% accessory. Although gene content variation within Ac. anophagefferens does not appear to be largely driven by recent prokaryotic LGTs (2.6% of accessory orthogroups), 368 orthogroups were acquired from bacteria in a common ancestor of all analyzed strains and are not found in P. calceolata or Au. lagunensis. A total of 1,077 recent LGTs from prokaryotes and viruses were identified within Pelagophyceae overall, constituting 3.5%-4.0% of the orthogroups in each species. This includes genes likely contributing to the ecological success of pelagophytes globally and in long-lasting harmful blooms.
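The core and accessory percentages reported above come from partitioning orthogroups by their presence across strains. A minimal sketch of that bookkeeping follows, assuming the strict definition in which a core orthogroup is present in every strain; the paper's exact criteria may differ, and the strain and orthogroup names are invented for illustration.

```python
def split_pangenome(presence):
    """Split a pangenome into core and accessory orthogroups.

    presence maps each strain name to the set of orthogroups found in it.
    Core = present in every strain (strict definition, an assumption here);
    accessory = present in at least one strain but not in all.
    """
    gene_sets = list(presence.values())
    if not gene_sets:
        raise ValueError("need at least one strain")
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    return core, pan - core


# Invented toy data: three strains, four orthogroups.
toy = {
    "strain1": {"OG1", "OG2", "OG3"},
    "strain2": {"OG1", "OG2", "OG4"},
    "strain3": {"OG1", "OG2"},
}
core, accessory = split_pangenome(toy)
pan_size = len(core) + len(accessory)
print(f"core: {100 * len(core) / pan_size:.1f}%")
print(f"accessory: {100 * len(accessory) / pan_size:.1f}%")
```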

RevDate: 2025-08-19

Dai S, Zhao P, Li W, et al (2025)

Global pangenome analysis highlights the critical role of structural variants in cattle improvement and identifies a unique event as a novel enhancer in IGFBP7[+] cells.

Molecular biology and evolution pii:8238201 [Epub ahead of print].

Based on a pangenome graph platform, we simultaneously analyzed the impacts of SNPs and SVs in the population structure and phenotypic formation of global cattle using 2,409 individuals from 82 breeds. We demonstrated that SVs, like SNPs, effectively explain the population structure of global cattle. Genomic regions under strong selection, identified using both SNPs and SVs, consistently revealed footprints associated with human-mediated selection of economic traits in European improved cattle or natural selection of geographical adaptations. Notably, we detected that ∼40.14% of SVs were not tagged (LD, r² < 0.6) by nearby SNPs. These "orphan" SVs may uncover new genetic signals and represent recent mutations associated with specific selection pressures or local environmental adaptation. Selected SVs tagged by SNPs also play causal or dominant roles in regions under selection. For example, our single-cell RNA sequencing has demonstrated that a notable SNP-tagged SV functions as an enhancer of the IGFBP7 gene, regulating fat deposition through IGFBP7+ cells. In conclusion, these SV-related mechanisms likely have caused some differences in economic traits and local adaptability across global cattle populations. Our integrated approaches highlight the unique and indispensable roles of SVs in shaping genetic diversity, offering novel insights into adaptation, selection, and strategies for improving cattle populations.
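The tagging criterion in the abstract above (an SV counts as untagged when its LD with nearby SNPs stays below 0.6) can be sketched as follows. This assumes LD is measured as the squared Pearson correlation between genotype dosages (0, 1, 2), a common approximation; the paper may compute it differently. The function names and genotype vectors are hypothetical.

```python
def ld_r2(x, y):
    """Squared Pearson correlation between two genotype dosage vectors."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("vectors must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a monomorphic variant carries no LD information
    return cov * cov / (var_x * var_y)


def is_tagged(sv, nearby_snps, threshold=0.6):
    """An SV is tagged if any nearby SNP reaches the r^2 threshold."""
    return any(ld_r2(sv, snp) >= threshold for snp in nearby_snps)


# Invented dosages for six individuals.
sv = [0, 1, 2, 0, 1, 2]
snp_in_ld = [0, 1, 2, 0, 1, 2]  # r^2 = 1.0 with the SV
snp_weak = [1, 1, 1, 0, 2, 1]   # r^2 = 0.125 with the SV
print(is_tagged(sv, [snp_weak]))             # False: an "orphan" SV
print(is_tagged(sv, [snp_weak, snp_in_ld]))  # True
```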

RevDate: 2025-08-19

Kenney SM, M'ikanatha NM, E Ganda (2025)

Genomic evolution of Salmonella Dublin in cattle and humans in the United States.

Applied and environmental microbiology [Epub ahead of print].

Increasingly, antimicrobial-resistant (AMR) Salmonella Dublin is a threat to human and animal health, therefore requiring a One Health approach to comprehensively understand pathogen evolution. Moreover, S. Dublin dissemination throughout the United States and the food supply chain is a concern for food safety and security. Here, we leveraged multi-agency biosurveillance data and genomic sequencing of S. Dublin strains to provide a robust analysis of its evolution across human, animal, and environmental reservoirs. This study advances our understanding of AMR S. Dublin, elucidates factors driving AMR emergence, and informs interventions to protect public health. In total, 2,150 strains collected between 2002 and 2023 throughout the United States from clinical bovine (N = 581), clinical human (N = 664), and environmental (N = 905) sources were identified. After uniform quality control, raw reads were assembled de novo followed by genome annotation and characterization of plasmids, antimicrobial resistance genes, and virulence factors. Strain relatedness was evaluated using a core genome maximum-likelihood phylogeny and pairwise core genome single-nucleotide polymorphism (SNP) differences. We identified the highest prevalence of drug-specific antimicrobial resistance genes and multidrug resistance plasmid, IncA/C2 (P < 0.001), in bovine clinical strains, which also had the greatest genetic diversity. Despite source-dependent differences in antimicrobial resistance gene frequency and types, 72% of S. Dublin strains in our study differed with at least one other strain by 20 or fewer SNPs. This high degree of genomic similarity highlights the potential for cross-transmission between humans, animals, and the environment and underscores the importance of considering strain source when assessing and monitoring antimicrobial resistance.

IMPORTANCE: Salmonella Dublin is a zoonotic, sometimes foodborne, pathogen that causes severe illness in cattle and humans. Our study takes a One Health approach to understanding genetic differences in strains within and between different reservoirs in the United States. We identified differences in antimicrobial resistance potential and genome content between clinical bovine, clinical human, and environmental strains. Nonetheless, the U.S. population of S. Dublin is highly related and diverges minimally over time and geography. These findings highlight the importance of the One Health framework when combating zoonotic antimicrobial-resistant pathogens like Salmonella Dublin.

RevDate: 2025-08-14

Alanko JN, Biagi E, SJ Puglisi (2025)

Finimizers: Variable-Length Bounded-Frequency Minimizers for $k$-mer Sets.

IEEE transactions on computational biology and bioinformatics, 22(2):899-910.

The minimizer of a $k$-mer is the smallest $m$-mer inside the $k$-mer according to some total order $<$ of the $m$-mers. Minimizers are often used as keys in hash tables in indexing tasks in metagenomics and pangenomics. The main weakness of minimizer-based indexing is the possibility of very frequently occurring minimizers, which can slow query times down significantly. Popular minimizer alignment tools employ various and often wild heuristics as workarounds, typically by ignoring frequent minimizers or blacklisting commonly occurring patterns, to the detriment of other metrics (e.g., alignment recall, space usage, or code complexity). In this paper, we introduce frequency-bounded minimizers, which we call finimizers, for indexing sets of $k$-mers. The idea is to use an order relation $<$ for minimizer comparison that depends on the frequency of the minimizers within the indexed $k$-mers. With finimizers, the length $m$ of the $m$-mers is not fixed, but is allowed to vary depending on the context, so that the length can increase to bring the frequency down below a user-specified threshold $t$. Setting a maximum frequency solves the issue of very frequent minimizers and gives us a worst-case guarantee for the query time. We show how to implement a particular finimizer scheme efficiently using the Spectral Burrows-Wheeler transform ($SBWT$) (Alanko et al. Proc. SIAM ACDA, 2023) augmented with longest common suffix information. In experiments, we explore in detail the special case in which we set $t = 1$. This choice simplifies the index structure and makes the scheme completely parameter-free apart from the choice of $k$. A prototype implementation of this scheme exhibits $k$-mer localization times close to, and often faster than, state-of-the-art minimizer-based schemes.
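The basic definition that opens the abstract above, the smallest $m$-mer inside a $k$-mer under some total order, can be sketched in a few lines. The example uses lexicographic order by default and, as a second illustration, an order that prefers rarer $m$-mers. This covers only the classic fixed-length minimizer; it is not the variable-length finimizer scheme or its SBWT-based implementation. The function name, the `key` parameter, and the frequency table are assumptions made for illustration.

```python
def minimizer(kmer, m, key=None):
    """Return the smallest m-mer of kmer under the order induced by key.

    With key=None the order is plain lexicographic. Ties under a custom key
    are resolved in favor of the leftmost m-mer.
    """
    if not 0 < m <= len(kmer):
        raise ValueError("m must satisfy 0 < m <= len(kmer)")
    candidates = [kmer[i:i + m] for i in range(len(kmer) - m + 1)]
    return min(candidates, key=key)


print(minimizer("GATTACA", 3))  # ACA, the lexicographically smallest 3-mer

# A frequency-aware order: rare m-mers sort first, ties broken by the string.
# The counts below are invented; a real index would derive them from its k-mer set.
freq = {"ACA": 10}
print(minimizer("GATTACA", 3, key=lambda s: (freq.get(s, 0), s)))  # ATT
```

Ordering by frequency steers the choice away from the over-represented `ACA`, which is the intuition behind bounding minimizer frequency.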

RevDate: 2025-08-13

Pushkarna S, Kumar A, Arora K, et al (2025)

Exploring the potential of Lactobacillus rhamnosus as gluten-digesting bacteria.

Irish journal of medical science [Epub ahead of print].

BACKGROUND: Celiac disease (CeD), a multifactorial disorder, develops when gluten, the toxic environmental inducer, interacts with CeD susceptibility genetic markers, resulting in a chronic enteropathy. Several extra-intestinal complications may also arise in cases of delayed management. There persists a growing demand to develop non-dietary adjuvant therapeutic options that can help relieve symptoms and improve patients' quality of life.

AIM: The present study conducted a bioinformatic approach to look into the potential of using Lactobacillus rhamnosus, a well-established probiotic, as gluten-digesting bacteria and provide the basis for future therapeutic developments.

METHODS: Complete genome assemblies of forty-nine L. rhamnosus strains were subjected to annotation using RAST and a pan genome analysis with BPGA. Genes for peptidases were identified using BlastKOALA and Prokka, followed by domain analysis using the NCBI-CD search tool to screen for gluten-digesting activity.

RESULTS: Genome annotation of all the strains under study highlighted the presence of sixty-one peptidases in L. rhamnosus. Domain analysis further revealed that nine of these peptidases, including aminopeptidase N, neutral endopeptidase, oligoendopeptidase F, dipeptidyl-peptidase 5, proline iminopeptidase, Xaa-Pro dipeptidyl-peptidase, aminopeptidase C, aminopeptidase E, and PII-type proteinase, shared domains with already established gluten-digesting enzymes, suggesting their potential role in degrading toxic gliadin peptides.

CONCLUSION: The current in silico analysis indicates that this well-known probiotic species, in addition to showcasing a plethora of beneficial properties, may also hold great potential in terms of reducing gluten toxicity. With further studies, L. rhamnosus can prove to be a promising candidate in CeD treatment and management.

RevDate: 2025-08-14

Wei CR, Basharat Z, P Adhikari (2025)

Implications of virtual screening for South African natural compounds against Plesiomonas shigelloides, a pathogen with zoonotic potential.

Computers in biology and medicine, 196(Pt B):110882.

Plesiomonas shigelloides is an emerging pathogen associated with gastroenteritis and poses a growing public health concern, especially in regions with limited access to advanced medical treatments. The purpose of this study was to explore the therapeutic potential of South African natural product compounds against P. shigelloides by targeting the essential enzyme Pyridoxine 5'-phosphate synthase or PPS (encoded by PdxJ). P. shigelloides proteomes (n = 26) were processed using the Bacterial Pan Genome Analysis (BPGA) pipeline to identify conserved targets. Targeting a conserved protein ensures the potential for broad-spectrum efficacy. PPS was chosen as the drug target and its structure was predicted using AlphaFold, enabling high-confidence modeling. Subsequently, docking was performed using AutoDock Vina, focusing on a library of South African compounds (n > 1000). The three inhibitors demonstrating strong binding affinities to the PPS were Scutiaquinone A, Mesquitol-(4α→5)-3,3',4',7,8-pentahydroxyflavonone, and Riccardin C. To further validate the stability and efficacy of these interactions, molecular dynamics (MD) simulations were carried out for 100 ns. The simulations revealed stable interactions between the inhibitors and PPS, suggesting potential inhibition of the PPS enzyme. The Mesquitol derivative was found to be the safest and is recommended for further experimental validation. This study highlights the promising potential of South African natural compounds in combating P. shigelloides infections, paving the way for the development of novel therapeutic strategies.

RevDate: 2024-03-13
CmpDate: 2024-03-01

Deery J, Carmody M, Flavin R, et al (2024)

Comparative genomics reveals distinct diversification patterns among LysR-type transcriptional regulators in the ESKAPE pathogen Pseudomonas aeruginosa.

Microbial genomics, 10(2):.

Pseudomonas aeruginosa, a harmful nosocomial pathogen associated with cystic fibrosis and burn wounds, encodes for a large number of LysR-type transcriptional regulator proteins. To understand how and why LTTR proteins evolved with such frequency and to establish whether any relationships exist within the distribution we set out to identify the patterns underpinning LTTR distribution in P. aeruginosa and to uncover cluster-based relationships within the pangenome. Comparative genomic studies revealed that in the JGI IMG database alone ~86 000 LTTRs are present across the sequenced genomes (n=699). They are widely distributed across the species, with core LTTRs present in >93 % of the genomes and accessory LTTRs present in <7 %. Analysis showed that subsets of core LTTRs can be classified as either variable (typically specific to P. aeruginosa) or conserved (and found to be distributed in other Pseudomonas species). Extending the analysis to the more extensive Pseudomonas database, PA14 rooted analysis confirmed the diversification patterns and revealed PqsR, the receptor for the Pseudomonas quinolone signal (PQS) and 2-heptyl-4-quinolone (HHQ) quorum-sensing signals, to be amongst the most variable in the dataset. Successful complementation of the PAO1 pqsR [-] mutant using representative variant pqsR sequences suggests a degree of structural promiscuity within the most variable of LTTRs, several of which play a prominent role in signalling and communication. These findings provide a new insight into the diversification of LTTR proteins within the P. aeruginosa species and suggest a functional significance to the cluster, conservation and distribution patterns identified.

RevDate: 2025-08-14
CmpDate: 2023-11-16

Bouznada K, Belaouni HA, A Meklat (2023)

Genome-based reclassification of Kitasatospora niigatensis as a later heterotypic synonym of Kitasatospora cineracea Tajima et al. (2001).

Antonie van Leeuwenhoek, 116(12):1327-1335.

The present study used genome-based approaches to investigate the taxonomic relationship between Kitasatospora cineracea DSM 44780[T] and Kitasatospora niigatensis DSM 44781[T], two species that were previously described by Tajima et al. (Int J Syst Evol Microbiol 51:1765-1771, 2001). The digital DNA-DNA hybridization (dDDH), average amino acid identity (AAI), and average nucleotide identity (ANI) values between the genomes of the two type strains were 90.3, 98.7, and 99.1%, respectively. These values exceeded the established thresholds of 70% (dDDH) and 95-96% (ANI and AAI) for bacterial species delineation, suggesting that K. cineracea and K. niigatensis should share the same taxonomic position. Furthermore, our analysis using the 'Bacterial Pan Genome Analysis' (BPGA) pipeline and the Maximum Likelihood core-genes tree inferred using FastTree2 consistently demonstrated that K. cineracea DSM 44780[T] and K. niigatensis DSM 44781[T] are closely related, as indicated by the clustering of these strains in the core-genes phylogenomic tree. Based on these findings, we propose that K. niigatensis should be considered a later heterotypic synonym of K. cineracea.
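The species-delineation logic in the abstract above reduces to comparing genome similarity values against cutoffs. A small sketch follows, assuming the lower end of the quoted 95-96% ANI range as the default cutoff and a greater-than-or-equal comparison; both choices, along with the function name, are illustrative assumptions and not the authors' procedure.

```python
def same_species(ddh, ani, ddh_cutoff=70.0, ani_cutoff=95.0):
    """Return True when both dDDH and ANI (in percent) reach their cutoffs."""
    return ddh >= ddh_cutoff and ani >= ani_cutoff


# dDDH and ANI values reported for the two type strains in the abstract.
print(same_species(ddh=90.3, ani=99.1))  # True
```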

RevDate: 2023-11-10
CmpDate: 2015-05-28

Islam MA, Waller AS, Hug LA, et al (2014)

New insights into Dehalococcoides mccartyi metabolism from a reconstructed metabolic network-based systems-level analysis of D. mccartyi transcriptomes.

PloS one, 9(4):e94808.

Organohalide respiration, mediated by Dehalococcoides mccartyi, is a useful bioremediation process that transforms ground water pollutants and known human carcinogens such as trichloroethene and vinyl chloride into benign ethenes. Successful application of this process depends on the fundamental understanding of the respiration and metabolism of D. mccartyi. Reductive dehalogenases, encoded by rdhA genes of these anaerobic bacteria, exclusively catalyze organohalide respiration and drive metabolism. To better elucidate D. mccartyi metabolism and physiology, we analyzed available transcriptomic data for a pure isolate (Dehalococcoides mccartyi strain 195) and a mixed microbial consortium (KB-1) using the previously developed pan-genome-scale reconstructed metabolic network of D. mccartyi. The transcriptomic data, together with available proteomic data helped confirm transcription and expression of the majority of genes in D. mccartyi genomes. A composite genome of two highly similar D. mccartyi strains (KB-1 Dhc) from the KB-1 metagenome sequence was constructed, and operon prediction was conducted for this composite genome and other single genomes. This operon analysis, together with the quality threshold clustering analysis of transcriptomic data helped generate experimentally testable hypotheses regarding the function of a number of hypothetical proteins and the poorly understood mechanism of energy conservation in D. mccartyi. We also identified functionally enriched important clusters (13 for strain 195 and 11 for KB-1 Dhc) of co-expressed metabolic genes using information from the reconstructed metabolic network. This analysis highlighted some metabolic genes and processes, including lipid metabolism, energy metabolism, and transport that potentially play important roles in organohalide respiration. Overall, this study shows the importance of an organism's metabolic reconstruction in analyzing various "omics" data to obtain improved understanding of the metabolism and physiology of the organism.

RevDate: 2025-08-18

Näpflin N, Schubert C, Malfertheiner L, et al (2025)

Gene-level analysis of core carbohydrate metabolism across the Enterobacteriaceae pan-genome.

Communications biology, 8(1):1241.

Enterobacteriaceae is a diverse bacterial family that commonly colonizes the gastrointestinal tracts of humans and animals, influences host health, and also includes members adapted to colonize the phyllosphere as well as insect hosts. We lack systematic knowledge regarding the core metabolic strategy shared among Enterobacteriaceae. To address this gap, we have analyzed the pan-genome of nearly 20,000 genomes, including Citrobacter, Escherichia, Klebsiella, and Salmonella. We found that genes necessary for monosaccharide-fuelled mixed acid fermentation and (micro-)aerobic respiration are part of the Enterobacteriaceae core genome, whereas most genes involved in anaerobic respiration and carbohydrate utilization are associated with the accessory genome. Most Enterobacteriaceae possess genes enabling the utilization of D-glucose, its epimers, D-glucose-containing disaccharides, and chemically modified derivatives of D-glucose - highlighting the evolutionary adaptation of this family to efficiently exploit this simple sugar. Understanding Enterobacteriaceae's core metabolic strategy helps clarify the distinction of niche-defining nutrient sources, which can be genus-, species- or strain-specific. This study highlights the core metabolic strategy of Enterobacteriaceae, supporting the development of targeted interventions in microbiome research and infectious disease control.

RevDate: 2025-08-18

Rosani U, Gerdol M, M Krupovic (2025)

The highly dynamic pangenome of basal chordates is enriched in defence and immunity genes and is inherited following the Mendelian law.

PLoS genetics, 21(8):e1011833 pii:PGENETICS-D-25-00112 [Epub ahead of print].

Pangenome analyses, which encompass the full genetic repertoire of a species, offer valuable insights into intraspecific diversity and phylogeographic gene patterns. While the taxonomic breadth and functional significance of animal pangenomes remain to be fully uncovered, recent findings (such as reports of open, bacterial-like pangenomes in bivalves) highlight the need to better understand the molecular mechanisms driving inter-haplotype structural variation. Genes affected by presence-absence variation (PAV), along with non-reference sequences (NRSs), represent evolutionary footprints that may shape genome architecture and plasticity, ultimately influencing the adaptability and long-term fitness of species. To investigate the pangenomic architecture of basal chordates, we analyzed available whole-genome resequencing data from Branchiostoma belcheri and B. floridae, examined the impact of structural genomic variation, and assessed the inheritance patterns of dispensable genes across generations. The pangenomes of both species include over a thousand genes affected by PAV and exhibiting trans-generational Mendelian transmission from parents to offspring. We further demonstrate that 35 dispensable genes in B. belcheri are of exogenous origin, likely resulting from the integration of a malacoherpesvirus genome, thereby extending the known host range of Malacoherpesviridae from invertebrates to chordates. PAV preferentially targeted gene families involved in defense, immunity, and cell signaling, including GTPases of immunity-associated proteins (GIMAPs), caspases, toll-like receptors, and pattern recognition receptors containing apextrin C-terminal (APEC) domains. The dynamic nature of immunity genes in cephalochordates parallels patterns seen in open bacterial pangenomes, suggesting that fundamental principles of genome evolution and innovation across life domains are shaped by host-pathogen interactions.

RevDate: 2025-08-18

Sadler MC, Wietz M, Mino S, et al (2025)

Genomic diversity and adaptation in Arctic marine bacteria.

mBio [Epub ahead of print].

Arctic marine bacteria experience seasonal changes in temperature, salinity, light, and sea ice cover. Time-series and metagenomic studies have identified spatiotemporal patterns in Arctic microbial communities, but a lack of complete genomes has limited efforts to identify the extent of genomic diversity in Arctic populations. We cultured and sequenced the complete genomes of 34 Arctic marine bacteria to identify patterns of gene gain, loss, and rearrangement that structure genomes and underlie adaptations to Arctic conditions. We found that the most abundant lineage in the Arctic (SAR11) is comprised of diverse species and subspecies, each encoding 50-150 unique genes. Half of the 16 SAR11 genomes harbor a genomic island with the potential to enhance survival in the Arctic by utilizing the osmoprotectant and potential methyl donor glycine betaine. We also cultured and sequenced four species representing an uncultured family of Pseudomonadales, four subspecies of Pseudothioglobus (SUP05), a genus of high GC Puniceispirillales (SAR116), and a family of low GC SAR116. Time-series 16S rRNA amplicon data indicate that this culture collection represents up to 60% of the marine bacterial community in Arctic waters. Their genomes provide insights into the evolutionary processes that underlie bacterial diversity and adaptation to Arctic waters.

IMPORTANCE: Genetic diversity has limited efforts to assemble and compare whole genomes from natural populations of marine bacteria. We developed a cultivation-based population genomics approach to culture and sequence the complete genomes of bacteria from the Arctic Ocean. Cultures and closed genomes obtained in this study represent previously uncultured families, genera, and species from the most abundant lineages of bacteria in the Arctic. We report patterns of gene gain, loss, rearrangement, and adaptation in the dominant lineage (SAR11), as well as the size, composition, and structure of genomes from several other groups of marine bacteria. This work demonstrates the potential for cultivation-based high-throughput genomics to enhance understanding of the processes underlying genomic diversity and adaptation.

RevDate: 2025-08-18

Henry JA (2024)

Population health management genomic new-born screens and multi-omics intercepts.

Frontiers in artificial intelligence, 7:1496942.

INTRODUCTION: The Population Health Management (PHM) Genomic Newborn Screens (GNBS) and Multi-Omics Intercepts for Human Phenotype Ontology (HPO) using Federated Data Platforms (FDP) represent a groundbreaking innovation in global health. This reform, supported by the UK's Genomic Medical Services (GMS) through "The Generation Study," aims to significantly reduce infant mortality by identifying and managing over 200 rare diseases from birth, paving the way for personalised health planning.

METHODS: Using an ecosystem approach, this study evaluates a diverse pangenome to predict health outcomes or confirm diagnoses prior to symptomatic manifestations. GNBS standardises care by integrating diagnostic techniques such as blood spot analysis and full blood cell diagnostics to stratify risk. The approach enhances the understanding of rare diseases in primary care medicine, with biomedical and haematology diagnoses re-evaluated. Scientific proof of concept and fit-for-purpose technology align multi-omics in pre-eXams (X = Gen AI).

RECOMMENDATIONS: The Digital Regulation Service (DRS) assembles an agile group of experts to enhance medical science through human phenotype ontology (HPO) for precise disease segmentation, scheduling accurate eXam intercepts where needed. This team strategically plans regulation services for digital HPO eXam assurance and implements Higher Expert Medical Science Safety (HEMSS) frameworks. The DRS is responsible for overseeing gene, oligonucleotide, and recombinant protein intercepts; commissioning blood pathology HPO eXam intercepts; and monitoring preliminary eXams with advanced imaging techniques.

DISCUSSION: In pursuit of excellence in PHM of HPO, HEMSS with Agile Group Development leverages the Genomic Newborn Screens (GNBS) and multi-omics to create personalised health plans integrated with NHS England Genomics and AI-driven DRS. The discourse extends to examining GNBS predictors and intercepts, focusing on their impact on public health and patient safety. Discussions encompass structured HPO knowledge addressing newborn health, ethical considerations, family privacy, and the benefits and limitations of pre-eXam screenings and life eXam intercepts. These debates involve stakeholders in adopting HPO-enhanced clinical pathways through Alliances for Health Systems Networking-Genomic Enterprise Partnerships (AHSN-GEP).

CONCLUSION: "The Generation Study" represents a paradigm in digital child health management using an HPO-X-Gen-AI framework, transitioning from trusted research to evidence-based discovery. This approach sets a standard for personalised healthcare practices, incorporating ontology risk stratification and future-ready analytics as outlined in the NHS Constitution. The discourse on higher expert medical science safety governance will continue in the forthcoming manuscript, "PHM Fit Lifecycles in Future Analytics," which will further explore developing localised health solutions for "Our Future Health."

RevDate: 2025-08-16

Qian Y, Zhou Z, Ouyang T, et al (2025)

Pangenome analysis of transposable element insertion polymorphisms reveals features underlying cold tolerance in rice.

Nature communications, 16(1):7634.

Transposable elements (TEs) introduce genetic and epigenetic variability, contributing to gene expression patterns that drive adaptive evolution in plants. Here, we investigate TE architecture and its effect on cold tolerance in rice. By analyzing a pangenome graph and the resequencing data of 165 rice accessions, we identify 30,316 transposable element insertion polymorphism (TIP) sites, highlighting significant diversity among polymorphic TEs (pTEs). We observe that pTEs exhibit increased H3K27me3 enrichment, suggesting a potential role in epigenetic differentiation under cold stress and in the transcriptional regulation of the cold response. We identify 26,914 TEs responsive to cold stress from transcriptome data, indicating their potential significance in regulatory networks for this response. Our TIP-GWAS analysis reveals two cold tolerance genes, OsCACT and OsPTR. The biological functions of these genes are confirmed using knockout and overexpression lines. Our web tool (https://cbi.gxu.edu.cn/RICEPTEDB/) makes all pTEs available to researchers for further analysis. These findings provide valuable targets for breeding cold-tolerant rice varieties, indicating the potential importance of pTEs in crop enhancement.

RevDate: 2025-08-16

Li Y, Huang Z, Zhu X, et al (2025)

Serotyping, molecular typing, and vaccine protein screening for Riemerella anatipestifer: Overcoming challenges in prevention and treatment.

Veterinary microbiology, 309:110663 pii:S0378-1135(25)00298-6 [Epub ahead of print].

Riemerella anatipestifer (R. anatipestifer) affects the duck farming industry worldwide, causing substantial economic losses. The current disease prevention and treatment strategies primarily include vaccines and antibiotics. However, the large number of serotypes and increasing resistance to R. anatipestifer make it challenging to prevent and treat the infection. This study carried out the serotyping and molecular typing of 51 R. anatipestifer strains and predicted vaccine proteins based on pan-genome analysis and cross-immune protection potential. For serotype identification, rabbits were immunized with antigens and 9 serotyped sera were prepared; the data revealed 6 serotypes with two unformed strains. The results for the self-made serotypes were consistent with those obtained from the externally submitted strains. Moreover, pan-genome analysis was performed on the 51 R. anatipestifer strains, and an open pan-genome set of 5094 genes was constructed. In addition, the COG annotation classification indicated that the core and non-core genomes were significantly different in gene functions. A total of 1116 core genomes that could serve as better cross-protective vaccine proteins were analyzed and revealed 5 genes of interest. In addition, the oprM-1 protein, a highly reactive protein, was expressed and purified, and the immunoreactivity with five antisera (anti-serotypes 1, 2, 5, 11, and 18) was demonstrated by Western blotting. This study fills the gaps in the existing typing systems for R. anatipestifer by combining serotyping, MLST typing, and pan-genome analysis. Furthermore, it provides valuable insights into the epidemiology, evolution, and pathogenesis of R. anatipestifer and paves the way for developing effective cross-protective vaccines.
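
As an illustration of the core versus non-core split that pan-genome analyses such as the one above rely on, the following is a minimal sketch, assuming gene families have already been clustered and each strain is represented as a set of gene-family IDs. The function name, the three-way split, and the toy strains are hypothetical and are not taken from the cited study, which does not describe its pipeline at this level.

```python
from collections import Counter

def classify_genes(presence):
    """Split gene families into core, accessory, and strain-specific sets.

    presence: dict mapping strain name -> set of gene-family IDs
    (hypothetical input format, assumed for this sketch).
    """
    n_strains = len(presence)
    # Count in how many strains each gene family occurs.
    counts = Counter(g for genes in presence.values() for g in genes)
    core = {g for g, c in counts.items() if c == n_strains}
    unique = {g for g, c in counts.items() if c == 1}
    accessory = set(counts) - core - unique
    return core, accessory, unique

# Toy data: three made-up strains and five made-up gene families.
toy = {
    "strainA": {"g1", "g2", "g3"},
    "strainB": {"g1", "g2", "g4"},
    "strainC": {"g1", "g3", "g5"},
}
core, accessory, unique = classify_genes(toy)
pan_genome_size = len(core | accessory | unique)
```

In this toy case the pan-genome is the union of all three sets; real analyses typically relax the core definition (for example, presence in at least 95% or 99% of strains) to tolerate assembly gaps.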

RevDate: 2025-08-15

Hatmaker EA, Barber AE, Drott MT, et al (2025)

Population structure in a fungal human pathogen is potentially linked to pathogenicity.

Nature communications, 16(1):7594.

Aspergillus flavus is a clinically and agriculturally important saprotrophic fungus responsible for severe human infections and extensive crop losses. Here, we analyze genomic data from 300 (117 clinical and 183 environmental) A. flavus isolates from 13 countries, including 82 clinical isolates sequenced in this study, to examine population and pan-genome structure and their relationship to pathogenicity. We use single nucleotide polymorphisms to build a phylogeny, analyze admixture, and perform discriminant analysis of principal components. We identify five A. flavus populations, including a new population, D, corresponding to distinct clades in the genome-wide phylogeny. Strikingly, > 75% of clinical isolates were in population D and <5% in population B. We also use orthogroup clustering to identify core and accessory genes within the pan-genome. Accessory genes, including genes within biosynthetic gene clusters, were significantly more common in some populations but rare in others. Our functional annotations show that population D is enriched for genes associated with carbohydrate metabolism, lipid metabolism and certain types of hydrolase activity, whereas a non-clinical population is depleted in genes related to zinc ion binding. In contrast to previous results from the major human pathogen Aspergillus fumigatus, isolation of A. flavus from human specimens is associated with population structure, providing a promising system for future investigations into the contributions of population-specific genetic differences to human infection.

RevDate: 2025-08-14

Teasdale LC, Murray KD, Collenberg M, et al (2025)

Pangenomic context reveals the extent of intraspecific plant NLR evolution.

Cell host & microbe, 33(8):1291-1305.e9.

Nucleotide-binding leucine-rich repeat (NLR) proteins are major components of the plant immune system, recognizing pathogen effectors and triggering defense responses. Because of the diversity of pathogen effector repertoires, NLRs have extraordinary sequence, structural, and regulatory variability. Although processes contributing to NLR diversity have been identified, the precise evolution of NLRs in their genomic context and along the multiple axes of diversity has been difficult to trace. We integrate genome-specific full-length transcript, homology, and transposable element information to annotate 3,789 NLRs in 17 diverse Arabidopsis thaliana accessions. We define 121 pangenomic NLR neighborhoods, which vary greatly in size, content, and complexity. NLRs are diverse across many axes, and multiple metrics are required to fully capture NLR variation. Based on these findings, we propose that diversity in diversity generation is fundamental to maintaining a functionally "adaptive" immune system in plants and that mechanistic studies should consider multiple axes of immune system diversity.

RevDate: 2025-08-15

Li M, Wu Y, Li H, et al (2025)

Comparative pan-genome analysis of Huperzia and Phlegmariurus and transcriptomics reveals thermal adaptation in Huperzia.

Functional & integrative genomics, 25(1):168.

Huperzia and Phlegmariurus are ancient genera within the Lycopodiaceae family with significant medicinal value and ecological adaptability, yet the evolutionary dynamics and genetic diversity of their chloroplast genomes remain poorly characterized. Specifically, critical aspects such as intergeneric differences, phylogenetic relationships, and adaptive evolution within their chloroplast genomes remain insufficiently explored. This study analyzed the chloroplast genomes of 66 species from these two genera through comparative genomics to elucidate their structural dynamics and adaptive mechanisms. Results revealed that Huperzia chloroplast genomes (153-155 kb, GC content 36.25-36.39%) are significantly larger than those of Phlegmariurus (148-151 kb, GC content 33.78-34.26%), with pronounced differences in IR boundary dynamics, repetitive sequence distribution, nucleotide diversity, and codon usage bias. Phylogenetic and population structure analyses confirmed the monophyly of both genera and demonstrated significantly higher genetic diversity in Phlegmariurus, likely linked to adaptive radiation driven by humid tropical environments. Transcriptomic data revealed a temporally coordinated chloroplast response to heat stress in Huperzia serrata. Photosynthetic core genes (such as psaB and rrna16) were downregulated, leading to sustained functional impairment. In contrast, early stress-response genes (such as rbcL and trnI-GAU) peaked at 4 h to enhance carbon fixation and transport. Mid-phase repair genes (such as ndhG and rps8) exhibited inverted U-shaped expression patterns to activate electron transport and protein synthesis, whereas late-stage overexpression of atpI restored energy homeostasis. This coordinated regulatory mechanism illustrates a survival strategy of "photosynthetic inhibition-stress compensation-energy reorganization" for thermal adaptation. Future studies should integrate nuclear genome and epigenetic modification data to further unravel the synergistic nucleo-cytoplasmic interactions underlying environmental adaptation.

RevDate: 2025-08-16

Ahmad B, Su Y, Hao Y, et al (2025)

Mango pangenome reveals dramatic impacts of reference bias on population genomic analyses.

Horticulture research, 12(9):uhaf166.

Most genomic studies start by mapping sequencing data to a reference genome. The quality of reference genome assembly, genetic relatedness to the studied population, and the mapping method employed directly impact variant calling accuracy and subsequent genomic analyses, introducing reference bias and resulting in erroneous conclusions. However, the impacts of reference bias have gained limited attention. This study compared population genomic analyses using four different reference genomes of mango (Mangifera indica), including the two haploid assemblies of haplotype-resolved telomere-to-telomere (T2T) genome assembly, a pangenome, and an older version of the reference genome available on NCBI. The choice of reference genome dramatically impacted the mapping efficiency and resulted in notable differences in calling the genetic variants, particularly structural variations (SVs). Phylogenetic analysis was more sensitive to the reference genome compared to genetic differentiation. Population genomic analyses of artificial selection in domestication and SV hotspot regions varied across reference genomes. Notably, the gene enrichment analyses showed significant differences in the top enriched biological processes depending on the reference genome used. Overall, the mango pangenome outperformed the other reference genomes across various metrics, followed by T2T reference genomes, as they captured greater diversity and effectively reduced reference bias. Our findings highlight the role of the mango pangenome in reducing reference bias and underscore the critical role of reference genome selection, suggesting that it is one of the most important factors in population genomic studies.

RevDate: 2025-08-13

Li H (2025)

Finding easy regions for short-read variant calling from pangenome data.

ArXiv pii:2507.03718.

BACKGROUND: While benchmarks on short-read variant calling suggest low error rate below 0.5%, they are only applicable to predefined confident regions. For a human sample without such regions, the error rate could be 10 times higher. Although multiple sets of easy regions have been identified to alleviate the issue, they fail to consider non-reference samples or are biased towards existing short-read data or aligners.

RESULTS: Here, using hundreds of high-quality human assemblies, we derived a set of sample-agnostic easy regions where short-read variant calling reaches high accuracy. These regions cover 88.2% of GRCh38, 92.2% of coding regions and 96.3% of ClinVar pathogenic variants. They achieve a good balance between coverage and easiness and can be generated for other human assemblies or species with multiple well assembled genomes.

CONCLUSION: This resource provides a convenient and powerful way to filter spurious variant calls for clinical or research human samples.
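
Coverage figures such as "88.2% of GRCh38" are the kind of number obtained by merging a set of regions and dividing the covered bases by the sequence length. The sketch below shows that arithmetic only, assuming half-open (start, end) intervals on a single sequence; the function name and the toy numbers are hypothetical and do not reproduce the method or data of the cited preprint.

```python
def covered_fraction(intervals, sequence_length):
    """Fraction of a sequence covered by possibly overlapping intervals.

    intervals: iterable of half-open (start, end) pairs (assumed format).
    """
    covered = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Close the previous merged interval before starting a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged interval.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / sequence_length

# Toy example: two overlapping regions and one separate region on 100 bases.
fraction = covered_fraction([(0, 40), (30, 60), (80, 100)], 100)
```

Merging first matters because overlapping regions would otherwise be counted twice and inflate the reported coverage.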

RevDate: 2025-08-13

Salehi Nowbandegani P, Zhang S, Hu H, et al (2025)

Defining and cataloging variants in pangenome graphs.

bioRxiv : the preprint server for biology pii:2025.08.04.668502.

Structural variation causes some human haplotypes to align poorly with the linear reference genome, leading to 'reference bias'. A pangenome reference graph could ameliorate this bias by relating a sample to multiple reference assemblies. However, this approach requires a new definition of a 'genetic variant.' We introduce a definition of pangenome variants and a method, pantree, to identify them. Our approach involves a pangenome reference tree which includes all nodes (sequences) of the pangenome graph, but only a subset of its edges; non-reference edges are variant edges. Our variants are biallelic and have well-defined positions. Analyzing the Minigraph-Cactus draft human pangenome reference graph, we identified 29.6 million genetic variants. Most variants (99.2%) are small, and most small variants (73.9%) are SNPs. 3.5 million variants (11.7%) have a reference allele which is not on GRCh38; these variants are difficult to detect without a pangenome reference, or with existing pangenome-based approaches. They tend to be embedded within tangled, multiallelic regions. We analyze two medically relevant regions, around the HLA-A and RHD genes, identifying thousands of small variants embedded within several large insertions, deletions, and inversions. We release an open-source software tool together with a VCF variant catalogue.
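
The idea of a reference tree that keeps every node but only a subset of edges, with the leftover edges treated as variants, can be illustrated on a toy graph. The sketch below is not the pantree algorithm: it assumes an undirected toy graph, seeds the tree with a given reference path, and grows it by breadth-first search. All names and the example graph are hypothetical.

```python
from collections import deque

def split_edges(nodes, edges, reference_path):
    """Partition edges into a spanning tree and leftover 'variant' edges.

    The tree is seeded with the edges along reference_path (which must
    appear in `edges` in the same orientation) and extended by BFS so
    that every reachable node is included exactly once.
    """
    tree = set(zip(reference_path, reference_path[1:]))
    visited = set(reference_path)
    adjacency = {n: [] for n in nodes}
    for u, v in edges:
        adjacency[u].append((u, v))
        adjacency[v].append((u, v))
    queue = deque(reference_path)
    while queue:
        node = queue.popleft()
        for edge in adjacency[node]:
            other = edge[1] if edge[0] == node else edge[0]
            if other not in visited:
                visited.add(other)
                tree.add(edge)
                queue.append(other)
    variant = [e for e in edges if e not in tree]
    return tree, variant

# Toy graph: reference path 1-2-3-4, a deletion-like edge skipping node 2,
# and an insertion-like detour through node 5.
nodes = [1, 2, 3, 4, 5]
edges = [(1, 2), (2, 3), (3, 4), (1, 3), (2, 5), (5, 3)]
tree, variant = split_edges(nodes, edges, [1, 2, 3, 4])
```

Each leftover edge closes exactly one cycle with the tree, which is one way to see why variants defined this way can be biallelic: the edge itself versus the tree path between its endpoints.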

RevDate: 2025-08-16

Depuydt L, Renders L, Van de Vyver S, et al (2025)

b-move: faster lossless approximate pattern matching in a run-length compressed index.

Algorithms for molecular biology : AMB, 20(1):15.

BACKGROUND: Due to the increasing availability of high-quality genome sequences, pan-genomes are gradually replacing single consensus reference genomes in many bioinformatics pipelines to better capture genetic diversity. Traditional bioinformatics tools using the FM-index face memory limitations with such large genome collections. Recent advancements in run-length compressed indices like Gagie et al.'s r-index and Nishimoto and Tabei's move structure, alleviate memory constraints but focus primarily on backward search for MEM-finding. Arakawa et al.'s br-index initiates complete approximate pattern matching using bidirectional search in run-length compressed space, but with significant computational overhead due to complex memory access patterns.

RESULTS: We introduce b-move, a novel bidirectional extension of the move structure, enabling fast, cache-efficient, lossless approximate pattern matching in run-length compressed space. It achieves bidirectional character extensions up to 7 times faster than the br-index, closing the performance gap with FM-index-based alternatives. For locating occurrences, b-move performs ϕ and ϕ⁻¹ operations up to 7 times faster than the br-index. At the same time, it maintains the favorable memory characteristics of the br-index: for example, all available complete E. coli genomes on NCBI's RefSeq collection can be compiled into a b-move index that fits into the RAM of a typical laptop.

CONCLUSIONS: b-move proves practical and scalable for pan-genome indexing and querying. We provide a C++ implementation of b-move, supporting efficient lossless approximate pattern matching including locate functionality, available at https://github.com/biointec/b-move under the AGPL-3.0 license.
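
Run-length compressed indices exploit the fact that the Burrows-Wheeler transform (BWT) of a repetitive collection contains few runs of identical characters. The sketch below only illustrates that property with a naive rotation-sorting BWT; it is not the move structure, the r-index, or any part of b-move, and the helper names are hypothetical.

```python
from itertools import groupby

def bwt(text):
    """Naive BWT via sorted rotations; '$' is the end-of-text sentinel."""
    text += "$"
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)

def run_length_encode(s):
    """Collapse s into (character, run length) pairs."""
    return [(ch, len(list(group))) for ch, group in groupby(s)]

# Classic small example.
banana_bwt = bwt("banana")            # "annb$aa"
banana_runs = run_length_encode(banana_bwt)

# A highly repetitive text, standing in for many near-identical genomes.
repetitive_bwt = bwt("ACGTACGA" * 8)
repetitive_runs = run_length_encode(repetitive_bwt)
print(len(repetitive_bwt), "characters in", len(repetitive_runs), "runs")
```

Sorting all rotations takes quadratic space and is only suitable for toy inputs; practical tools build the BWT from a suffix array or with dedicated construction algorithms.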

RevDate: 2025-08-16

Kulmanov M, Ashouri S, Liu Y, et al (2025)

Phased genome assemblies and pangenome graphs of human populations of Japan and Saudi Arabia.

Scientific data, 12(1):1316.

The selection of a reference sequence in genome analysis is critical, as it serves as the foundation for all downstream analyses. Recently, the pangenome graph has been proposed as a data model that incorporates haplotypes from multiple individuals. Here we present JaSaPaGe, a pangenome graph reference for Saudi Arabian and Japanese populations, both of which have been significantly underrepresented in previous genomic studies. We constructed JaSaPaGe from high-quality phased diploid assemblies which were made utilizing PacBio high-fidelity long reads, Nanopore long reads, and Hi-C short reads of 9 Saudi and 10 Japanese individuals. Quality evaluation of the pangenome graph by variant calling showed that our pangenome outperformed earlier linear reference genomes (GRCh38 and T2T-CHM13) and showed comparable performance to the pangenome graph provided by the Human Pangenome Reference Consortium (HPRC), with more variants found in Japanese and Saudi samples using their population-specific pangenomes. This pangenome reference will serve as a valuable resource for both the research and clinical communities in Japan and Saudi Arabia.

RevDate: 2025-08-12

Qing Y, Liao Z, An D, et al (2025)

Comparative genomics reveals the genetic diversity and plasticity of Clostridium tertium.

Journal of applied microbiology pii:8232670 [Epub ahead of print].

AIMS: Clostridium tertium, increasingly recognized as an emerging human pathogen frequently isolated from environmental and clinical specimens, remains genetically underexplored despite its clinical relevance. This study aims to explore the genetic characteristics of C. tertium by genomic analysis.

METHODS AND RESULTS: This study presented a comprehensive genomic investigation of 45 C. tertium strains from the GenBank database. Genome sizes (3.27-4.55 Mbp) and coding gene counts varied markedly across strains. Phylogenetic analyses based on 16S rRNA gene and core genome uncovered distinct intra-species lineages, including evolutionarily divergent clusters likely shaped by niche specialization. Pan-genomic analysis confirmed an open genome, with accessory and strain-specific genes enriched in functions related to environmental adaptation and regulation. Functional annotation further identified diverse virulence factor genes (e.g. clpP, nagK) and antibiotic resistance genes (e.g. vatB, tetA(P)) co-occurring with mobile genetic elements (MGEs), suggesting that horizontal gene transfer (HGT) may be a key driver of genome plasticity in C. tertium. Notably, one-third of the strains carried CRISPR-Cas systems, indicating the defense potential against exogenous genetic elements.

CONCLUSIONS: C. tertium exhibited extensive genetic diversity and genome plasticity, probably driven by MGE-mediated HGT, defense mechanisms of CRISPR-Cas systems, and functional adaptation related to virulence and resistance. These traits may underlie its ability to colonize diverse environments and acquire pathogenicity and resistance.

RevDate: 2025-08-14
CmpDate: 2025-08-12

Anderson OH, Chong JPJ, Thomas GH (2025)

Comparative genomics of Clostridium butyricum reveals a conserved genome architecture and novel virulence-related gene clusters.

Microbial genomics, 11(8):.

Bacteria from the species Clostridium butyricum encompass a diverse range of phenotypes. While some strains are used as probiotics, others have been isolated from cases of botulism and necrotizing enterocolitis (NEC) in preterm neonates. We identify a unique genomic feature of this species, namely a highly conserved extrachromosomal element of ~0.8 Mb. This replicon satisfies the three principal criteria used to define a chromid, which include the possession of core genes that are encoded on the main chromosome in other species. Although C. butyricum is the type species of Clostridium, we find that the possession of a chromid is not a typical feature of members of this genus and represents a unique genomic fingerprint of the species C. butyricum. Furthermore, we show that pathogenic C. butyricum strains from the sequenced examples are not monophyletic, which suggests that virulence has evolved multiple times from related non-pathogenic ancestors. However, we were able to identify common genes which are found exclusively in these pathogenic strains. In addition to the botulinum neurotoxin genes, these include a novel set of genes involved in the biosynthesis of a capsular polysaccharide (CPS), and genes that confer the ability to utilize the mucin-derived sugar l-fucose, which may provide a competitive advantage for growth in the colon. Moreover, by identifying NEC strain-associated virulence factors, we are able to further the understanding of these particularly harmful strains.

RevDate: 2025-08-16
CmpDate: 2025-08-12

Zou Y, Zhu W, Hou Y, et al (2025)

The evolutionary dynamics of organellar pan-genomes in Arabidopsis thaliana.

Genome biology, 26(1):240.

BACKGROUND: In plants, comparative analyses of organellar genomes are often based on draft assemblies. Large-scale investigations into the complex structural rearrangements of mitochondrial genomes remain scarce.

RESULTS: Here, we perform a comprehensive analysis of the dominant conformations and dynamic heteroplasmic variants of organellar genomes in the model plant Arabidopsis thaliana, utilizing high-quality long-read assemblies validated at high resolution from 149 samples. We find that mitochondrial and plastid genomes share common types of structural and small-scale variants driven by similar DNA sequence features. However, rearrangements mediated by repetitive sequences in mitochondrial genomes evolve so rapidly that they are often decoupled from other types of variants. Rare complex events involving elongation and fusion of existing repeats are also observed, contributing to the unalignable regions commonly found at the interspecies level. Additionally, we demonstrate that disrupting and rescuing organellar DNA maintenance could drive the rapid evolution of dominant mitochondrial genome conformations.

CONCLUSIONS: Our study provides an unprecedentedly detailed view of the dynamics of organellar genomes at pan-genome scale in Arabidopsis thaliana, paving the way to unlock the full potential of organellar genetic resources.

RevDate: 2025-08-11

Engelhorn J, Snodgrass SJ, Kok A, et al (2025)

Genetic variation at transcription factor binding sites largely explains phenotypic heritability in maize.

Nature genetics [Epub ahead of print].

Comprehensive maps of functional variation at transcription factor (TF) binding sites (cis-elements) are crucial for elucidating how genotype shapes phenotype. Here, we report the construction of a pan-cistrome of the maize leaf under well-watered and drought conditions. We quantified haplotype-specific TF footprints across a pan-genome of 25 maize hybrids and mapped over 200,000 variants, genetic, epigenetic, or both (termed binding quantitative trait loci (bQTL)), linked to cis-element occupancy. Three lines of evidence support the functional significance of bQTL: (1) coincidence with causative loci that regulate traits, including vgt1, ZmTRE1 and the MITE transposon near ZmNAC111 under drought; (2) bQTL allelic bias is shared between inbred parents and matches chromatin immunoprecipitation sequencing results; and (3) partitioning genetic variation across genomic regions demonstrates that bQTL capture the majority of heritable trait variation across ~72% of 143 phenotypes. Our study provides an auspicious approach to make functional cis-variation accessible at scale for genetic studies and targeted engineering of complex traits.

RevDate: 2025-08-11
CmpDate: 2025-08-11

Tellatin D, Cornet L, Snauwaert V, et al (2025)

Melissospora conviva gen. nov., sp. nov., a novel actinobacterial genus isolated from beehive through cross-feeding interactions.

International journal of systematic and evolutionary microbiology, 75(8):.

Most micro-organisms remain unculturable under standard laboratory conditions, limiting our understanding of microbial diversity and ecological interactions. One major cause of this uncultivability is the loss of access to essential cross-fed metabolites when bacteria are removed from their natural communities. During a bioprospecting campaign targeting actinomycetes of an Apis mellifera beehive, we identified five isolates (DT32, DT45[T], DT55, DT59 and DT194) that required co-cultivation for growth recovery, suggesting a dependence on microbial interactions in their native habitat. Whole-genome sequencing and phylogenetic analysis positioned these isolates within a distinct lineage of Micromonosporaceae, separate from the five officially recognized clades of the Micromonospora genus. A combination of microscopic, chemotaxonomic and physiological characterizations further supported their uniqueness. Notably, they exhibited high auxotrophy, being unable to use all carbon sources tested, likely due to genome reduction (4.6 Mbp) compared to other Micromonosporaceae. Pangenomic comparisons with their closest Micromonospora relatives revealed gene losses in key metabolic pathways, including the glyoxylate bypass and the Entner-Doudoroff pathway, which may explain their metabolic reliance. These findings reveal a highly specialized, ecologically adapted lineage with deep evolutionary divergence and further support microbial interdependence isolation strategies to explore the microbial dark matter. We propose Melissospora conviva as a novel genus and species within the Actinomycetota phylum, with isolate DT45[T] as the representative type species and type strain, which has been deposited in public collections under the accession numbers DSM 117791 and LMG 33580.

RevDate: 2025-08-10

Yerka MK, Liu Z, Bean S, et al (2025)

An updated molecular toolkit for genomics-assisted breeding of waxy sorghum [Sorghum bicolor (L.) Moench].

Journal of applied genetics [Epub ahead of print].

Several mutations of the sorghum [Sorghum bicolor (L.) Moench] GRANULE-BOUND STARCH SYNTHASE (GBSS) gene [Sobic.010G022600; commonly known as Waxy (Wx)] result in a low amylose:amylopectin starch ratio. Recessive waxy (wx) alleles improve starch digestibility in ethanol production, human foods and beverages, and animal feed. However, breeding waxy sorghum is challenging due to reliance on traditional PCR markers for genotyping, which are not amenable to next-generation sequencing (NGS). Most commercial breeding programs use high-throughput genotyping and genomic selection in large, segregating populations prior to flowering. This study provides the first published NGS markers for the two most commonly used waxy (wx) alleles of sorghum and is the first to fully sequence the large insertion that is causal of the wx[a] allele. In the absence of a pangenome including wx[a] genotypes, we constructed an in silico B.Tx623 wx[a] genome assembly from the B.Tx623 reference genome (v3.1.1) including the insertion, a ~5-kb long terminal repeat (LTR) retrotransposon of the copia superfamily. The in silico wx[a] assembly improved read mapping at Sobic.010G022600 in wx[a] individuals, identified 78 new uniquely mapped reads, and made it possible to distinguish different Waxy genotypes using short-read sequencing data. Functional PACE-PCR markers, suitable for marker-assisted selection and multiplexed, low-to-mid-density genomic selection, were developed for Wx, wx[a], and wx[b] alleles. The PACE markers were validated in segregating populations of three public and private breeding programs. These new molecular breeding resources comprise a toolkit that will improve the efficiency of developing commercial waxy sorghum hybrids using genomics-assisted approaches.

RevDate: 2025-08-13

Weers T, Feng Y, RJ Peters (2025)

Uncrossing the 'X': Characterization of alternative alleles for KSLX in Oryza.

Phytochemistry, 240:114634 pii:S0031-9422(25)00257-2 [Epub ahead of print].

The widely cultivated Asian rice (Oryza sativa) produces a variety of physiologically relevant diterpenoid products, which range in effect from the phytohormone gibberellin, derived from ent-kaurene, to phytoalexins such as the momilactones, derived from syn-pimara-7,15-diene. Previous reports have shown functional variation in the kaurene synthase-like (KSL) genes responsible for synthesizing diterpene precursors to more specialized metabolites, leading to the creation of distinct diterpenoids from allelomorphic genes. Reported here are the products of two previously discovered but uncharacterized alleles of the unusual KSLX, representing a cross between (fusion of) the tandem pair KSL8-KSL9p found in most cultivars. The previously characterized allele (KSLXo) was reported to act on syn-copalyl pyrophosphate (syn-CPP) to produce syn-abieta-7,12-diene, precursor to the phytoalexin oryzalactone. However, at least one other functionally distinct allele was reported from the O. sativa pan-genome (KSLXn), along with another phylogenetically distinct allele found in Oryza barthii (KSLXb), but these were not further characterized. Here both KSLXn and KSLXb were found to selectively react with syn-CPP and produce syn-pimara-9(11),15-diene, a novel diterpene in rice. Additionally, evolution of this locus was investigated, with KSLXb hypothesized to be a functional KSL9. The striking complexity of this locus, which includes distinct composition (KSL8-KSL9(p) or KSLX) as well as allelomorphism of both KSL8 and KSLX, suggests it is subject to balancing selection, consistent with the competing pressures exerted on phytoalexin biosynthesis. Regardless, the studies reported here clarify this additional example of allelomorphic variation in the rice KSL family, providing insight into the rice pan-genomic diterpenoid arsenal.

RevDate: 2025-08-09

Gross JE, Fullmer J, McCleland G, et al (2025)

Genomic and epidemiologic investigation of Mycobacterium abscessus isolates in a cystic fibrosis center to determine potential routes of transmission.

Journal of cystic fibrosis : official journal of the European Cystic Fibrosis Society pii:S1569-1993(25)01526-7 [Epub ahead of print].

BACKGROUND: Cystic Fibrosis (CF) Centers worldwide have reported healthcare-associated outbreaks of nontuberculous mycobacteria (NTM). We report a retrospective investigation of shared Mycobacterium abscessus strains among people with cystic fibrosis (pwCF) receiving care at Dell Children's/Ascension combined Pediatric and Adult CF Program (DCMC).

METHODS: Whole genome sequencing (WGS) was used to identify genetically similar isolates among 167 NTM isolates from 57 pwCF. Epidemiological investigation, respiratory and environmental isolate comparisons, and watershed mapping were performed.

RESULTS: WGS analysis revealed four M. abscessus clusters, two ssp. abscessus and two ssp. massiliense. One subject was infected with two distinct clustered M. abscessus (ssp. abscessus and ssp. massiliense). Epidemiologic investigation demonstrated opportunities for healthcare-associated transmission within all clusters. Two ssp. massiliense subject pairs had healthcare overlaps and high genomic relatedness, including one cohabitating sibling pair. M. abscessus recovered from DCMC revealed genetic similarity to a respiratory isolate from one patient who was never exposed to the hospital environment.

CONCLUSIONS: We identified shared M. abscessus strains via genomic analysis among pwCF at DCMC. None of the clustered patient isolates matched hospital environmental isolates at the genomic level. One hospital environmental isolate had genomic similarity to a respiratory isolate of M. abscessus, but the epidemiologic investigation revealed no evidence of subject exposure to the hospital setting. One ssp. massiliense subject pair had the same level of pangenome relatedness as the sibling pair and epidemiological investigation revealed overlap in the clinic, supporting healthcare-associated person-to-person transmission among the pair within a cluster. One pwCF had polyclonal clustered infections, suggesting multiple environmental sources of acquisition outside the healthcare environment.

RevDate: 2025-08-12
CmpDate: 2025-08-09

Wu H, Chen S, Wang J, et al (2025)

Pangenome analysis of Liriodendron reveals presence/absence variations associated with growth traits.

BMC plant biology, 25(1):1039.

Beyond single nucleotide polymorphisms (SNPs), gene presence/absence variation (PAV) plays a crucial role in elucidating species' genetic diversity, uncovering the genetic basis of key traits, and advancing molecular marker-assisted breeding in plants. In this study, we constructed a pangenome of Liriodendron based on 24 accessions. Comparative analysis with the reference genome revealed 116 Mb of non-reference sequences and yielded 32,773 genes, including 3,558 novel genes. We subsequently employed resequencing data from 247 Liriodendron genotypes to identify PAVs, comprising 13,779 core genes and 18,179 dispensable genes. To further assess PAV applicability, a genome-wide association study (GWAS) was conducted to link gene PAVs with growth traits in hybrid Liriodendron, identifying 14 candidate genes associated with these traits. Additionally, gene PAVs appeared to predominantly contribute to heterosis in growth traits, displaying a dominant expression pattern when comparing leaf, shoot, and phloem tissues of strong and weak heterotic combinations. Moreover, two key candidate genes, Litul.02G164100 and Litul.01G057400, exhibit high parental expression patterns consistent with hybrid vigor in strong heterotic combinations of leaf and shoot tissues. Altogether, this study expands the Liriodendron genomic dataset, identifies candidate genes linked to growth traits, and provides insights into their heterotic mechanisms in hybrid Liriodendron.
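To illustrate how a gene presence/absence matrix can be split into core and dispensable sets, here is a minimal Python sketch. The gene names, accession names, and the rule "core means present in every accession" are assumptions made for illustration; the abstract does not describe the study's own thresholds or pipeline.

```python
def partition_pav(pav):
    """Split genes into core and dispensable sets.

    pav maps a gene name to a dict of accession name -> bool (present?).
    Assumption for this sketch: a gene is "core" only if it is present
    in every accession; anything else is "dispensable".
    """
    core, dispensable = set(), set()
    for gene, calls in pav.items():
        if all(calls.values()):
            core.add(gene)
        else:
            dispensable.add(gene)
    return core, dispensable


# Hypothetical toy matrix: three genes scored across three accessions.
toy_pav = {
    "geneA": {"acc1": True, "acc2": True, "acc3": True},
    "geneB": {"acc1": True, "acc2": False, "acc3": True},
    "geneC": {"acc1": False, "acc2": False, "acc3": True},
}

core, dispensable = partition_pav(toy_pav)
print(sorted(core), sorted(dispensable))
```

Real PAV calls come from read mapping and usually tolerate some missing data, so a production rule would likely relax the strict "all accessions" criterion.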

RevDate: 2025-08-08
CmpDate: 2025-08-09

Rauniyar S, Samanta D, Thakur P, et al (2025)

Mapping the pangenome of sulfate reducing bacteria: core genes, plasticity, and novel functions in Desulfovibrio spp.

World journal of microbiology & biotechnology, 41(8):305.

The pangenome of sulfate reducing bacteria represents a genetic reservoir that deciphers the intricate interplay of conserved and variable elements driving their ecological dominance, evolutionary adaptability, and industrial relevance. This study introduces the most comprehensive pangenome analysis of the genus Desulfovibrio to date, incorporating 63 complete and high-quality genomes using the Partitioned Pangenome Graph of Linked Neighbors (PPanGGOLiN) pipeline. The structure and dynamics of core gene families were investigated through gene ontology, KEGG pathway mapping, and gene network analyses, shedding light on the functional organization of the Desulfovibrio genomes. The analysis categorized 799, 4,053, and 43,581 gene families into persistent, shell, and cloud groups, respectively. A core set of 326 gene families, conserved across Desulfovibrio genomes, highlights their essential role in community functionality. Genome plasticity analysis identified 4,576 regions of genome plasticity, with 1,322 hotspots enriched in horizontally acquired genes (89% in the cloud partition). Key gene families in these regions included glpE, fdhD, petC, and cooF, linked to sulfur metabolism. Out of 29 hypothetical genes, one was linked to actin nucleation, another contained a TRASH domain, and a third regulates filopodium assembly. Other predicted functions included lnrL, folE, RNA binding, and pyrG/pyrH involvement in CTP biosynthesis. Additionally, genomic islands revealed evolutionary events, such as cheY acquisition in Oleidesulfovibrio alaskensis G20. This study provides a genus-wide view of Desulfovibrio, emphasizing genome plasticity, hypothetical gene functions, and adaptation mechanisms.

RevDate: 2025-08-08

Sui Y, Lin J, Noyes MD, et al (2025)

Pangenome discovery of missing autism variants.

medRxiv : the preprint server for health sciences pii:2025.07.21.25331932.

Autism spectrum disorders (ASDs) are genetically and phenotypically heterogeneous and the majority of cases remain genetically unresolved. To better understand large-effect pathogenic variation, we generated long-read sequencing data to construct phased and near-complete genome assemblies (average contig N50=43 Mbp, QV=56) for 189 individuals from 51 families with unsolved cases of autism. We applied read- and assembly-based strategies to facilitate comprehensive characterization of de novo mutations (DNMs), structural variants (SVs), and DNA methylation profiles. Merging common SVs obtained from long-read pangenome controls, we efficiently filtered >97% of common SVs exclusive to 87 offspring. We find no evidence of increased autosomal SV burden for probands when compared to unaffected siblings yet note a trend for an increase of SV burden on the X chromosome among affected females. We establish a workflow to prioritize potential pathogenic variants by integrating autism risk genes and putative noncoding regulatory elements defined from ATAC-seq and CUT&Tag data from the developing cortex. In total, we identified three pathogenic variants in TBL1XR1, MECP2, and SYNGAP1, as well as nine candidate de novo and biparental homozygous SVs, most of which were missed by short-read sequencing. Our work highlights the potential of phased genomes to discover more complex pathogenic mutations and the power of the pangenome to restrict the focus on an increasingly smaller number of SVs for clinical evaluation.

RevDate: 2025-08-12
CmpDate: 2025-08-08

Rohde A, Albertsen MC, Boden SA, et al (2025)

New genomic resources to boost research in reproductive biology to enable cost-effective hybrid seed production.

The plant genome, 18(3):e70092.

The commercial realization of hybrid wheat (Triticum aestivum L.) is a major technological challenge to sustainably increase food production for our growing population in a changing climate. Despite recent advances in cytoplasmic- and nuclear-based pollination control systems, the inefficient outcrossing of wheat's autogamous florets remains a barrier to hybrid seed production. There is a pressing need to investigate wheat floral biology and enhance the likelihood of ovaries being fertilized by airborne pollen so breeders can select and utilize male and female parents for resilient, scalable, and cost-effective hybrid seed production. Advances in understanding the wheat genomes and pangenome will aid research into the underlying floral organ development and fertility with the aim to stabilize pollination and fertilization under a changing climate. The purpose of this position paper is to highlight priority areas of research to support hybrid wheat development, including (1) structural aspects of florets that affect stigma presentation, longevity, and receptivity to airborne pollen, (2) pollen release dynamics (e.g., anther extrusion and dehiscence), and (3) the effect of heat, drought, irradiation, and humidity on these reproductive traits. A combined approach of increased understanding built on the genomic resources and advanced trait evaluation will deliver robust measures for key floral characteristics, such that diverse germplasm can be fully exploited to realize the yield improvements and yield stability offered by hybrids.

RevDate: 2025-08-11
CmpDate: 2025-08-11

Tabashiri R, Mahmoodian S, Pakdel MH, et al (2025)

Comprehensive in vitro and whole-genome characterization of probiotic properties in Pediococcus acidilactici P10 isolated from Iranian broiler chicken.

Scientific reports, 15(1):28953.

This study presents a comprehensive characterization of Pediococcus acidilactici strain P10, a novel probiotic isolated from native broiler chickens, integrating in vitro analyses with whole-genome sequencing. P10 demonstrates promising probiotic attributes, supported by both phenotypic and genomic evidence. The strain was non-hemolytic and exhibited high survival rates under simulated gastrointestinal conditions (95-99% in acidic pH, 55% in bile salts), with genomic analysis confirming the presence of associated stress resistance genes. Importantly, P10 displayed potent broad-spectrum antimicrobial activity against key pathogens, underpinned by the identification of multiple putative bacteriocin-encoding genes. Furthermore, the strain showed strong adherence to intestinal epithelial cells, with corresponding adhesion genes identified in its genome. Beyond these phenotypic-genotypic correlations, P10's whole-genome sequencing revealed significant novel findings. The 1.84 Mb genome confirmed P10 as P. acidilactici and, most notably, identified a complete, functional Type II-A CRISPR-Cas system. This system, with 17 phage-matching spacers, represents a robust antiviral defense mechanism, a key and distinct feature for probiotic application. Additionally, pan-genomic analysis highlighted 59 genes unique to P10 not found in other P. acidilactici strains, suggesting novel metabolic and adaptive capabilities previously uncharacterized within the species. In summary, Pediococcus acidilactici strain P10 is a highly promising probiotic, combining confirmed resilience and antimicrobial action with unique genomic advantages such as its specialized CRISPR-Cas system and novel genetic elements, making it a valuable candidate for applications in animal health and functional foods.

RevDate: 2025-08-07

Du ZZ, He JB, Xiao PX, et al (2025)

Varigraph: an accurate and widely applicable pangenome graph-based variant genotyper for diploid and polyploid genomes.

Molecular plant pii:S1674-2052(25)00267-9 [Epub ahead of print].

Accurate variant genotyping is crucial for genomics-assisted breeding. Graph pangenome references can address single-reference bias, thereby enhancing the performance of variant genotyping and empowering downstream applications in population genetics and quantitative genetics. However, existing pangenome-based genotyping methods struggle with large or complex pangenome graphs, particularly in polyploid genomes. Here, we introduce Varigraph, an algorithm that leverages the comparison of unique and repetitive k-mers between variant sites and short reads for genotyping both small and large variants. We evaluated Varigraph on a diverse set of representative plant genomes as well as human genomes. Varigraph outperforms current state-of-the-art linear and graph-based genotypers across non-human genomes while maintaining comparable genotyping performance in human genomes. By employing efficient data structures including Counting Bloom Filter and bitmap storage, as well as GPU models, Varigraph achieves improved precision and robustness in repetitive regions while managing computational costs for large datasets. Its wide applicability extends to highly repetitive or large genomes, such as those of maize and wheat. Significantly, Varigraph can handle extensive pangenome graphs, as demonstrated by its performance on a dataset containing 252 rice genomes, where it achieved a precision exceeding 0.9 for both small and large variants. Notably, Varigraph is capable of effectively utilizing pangenome graphs for genotyping autopolyploids, enabling precise determination of allele dosage. This work provides a robust and accurate solution for genotyping plant genomes and will advance plant genomic studies and genomics-assisted breeding.

RevDate: 2025-08-09
CmpDate: 2025-08-06

Cao S, Sawettalake N, L Shen (2025)

Lactuca super-pangenome provides insights into lettuce genome evolution and domestication.

Nature communications, 16(1):7257.

Lettuce (Lactuca sativa L.) is among the most widely cultivated and consumed vegetables globally, valued for its phytonutrients beneficial for human health. Here, we report the high-quality reference super-pangenome for the Lactuca genus by integrating 12 chromosome-scale genomes from representative cultivated lettuce morphotypes, landrace, and wild relatives, and investigate lettuce genome evolution and domestication. These assemblies exhibit diverse genome sizes ranging from 2.1 Gb to 5.5 Gb with abundant repetitive sequences, and expansion of repetitive sequences, associated with low DNA methylation levels, likely contributes to the genome size variations. Furthermore, by constructing a graph-based lettuce pangenome reference, we explore the landscape, genomic, and epigenetic features of structural variations, and reveal their contributions to gene expression and domestication. We also identify copy number variations in FLOWERING LOCUS C in association with delayed flowering in cultivated lettuce. Overall, this comprehensive Lactuca super-pangenome assembly will expedite functional genomics studies and breeding efforts of this globally important crop.

RevDate: 2025-08-08

Elakkya M, González-Salazar LA, López-Reyes K, et al (2025)

Comparative genomics and metabolomics reveal phytohormone production, nutrient acquisition, and osmotic stress tolerance in Azotobacter chroococcum W5.

Frontiers in microbiology, 16:1626016.

INTRODUCTION: Concerns about ecological degradation and reduced biodiversity have intensified the search for sustainable solutions in agriculture. The use of plant growth-promoting bacteria (PGPB) offers a promising alternative to enhance soil quality and crop yield while reducing the consumption of chemical fertilizers.

METHODS: Here, we characterize the plant growth-promoting potential of Azotobacter chroococcum W5 through comparative genomics, in vitro experiments, and metabolomic analyses.

RESULTS: Comparative genomic analysis revealed plant growth-promoting traits, including phytohormone biosynthesis, nutrient acquisition, stress adaptation, and colonization in the A. chroococcum W5 strain. Experimental assays confirmed the production of auxin, gibberellic acid, phosphate solubilization, moderate nitrogen fixation, and growth on ACC. Wheat seed inoculation significantly enhanced germination metrics, seedling vigor, and altered carbohydrate metabolism in the seed endosperm. Under salt and osmotic stress, A. chroococcum W5 metabolomic profiling revealed adaptive responses, including elevated levels of osmoprotectants (proline, glycerol) and oxidative stress markers such as 2-hydroxyglutarate, while putrescine and glycine decreased.

DISCUSSION: Our results show that the A. chroococcum W5 strain has great potential for the development of novel formulations. More importantly, our results highlight the potential of using plant growth-promoting microorganisms for innovative, sustainable solutions in agriculture.

RevDate: 2025-08-08
CmpDate: 2025-08-06

Stevens-Green R, Chénard C, Mordret S, et al (2025)

Organellar Genomes of Three Globally Important Nanoplanktonic Diatoms Refine Their Taxon-Specific Distribution and Succession Patterns in the Northwest Atlantic.

The Journal of eukaryotic microbiology, 72(5):e70033.

Nanoplanktonic diatoms (2-20 μm) are a significant yet historically understudied component of marine ecosystems. We investigated three recently isolated nanoplanktonic diatoms from the Northwest Atlantic Ocean (NWA): Minidiscus spinulatus, Mediolabrus comicus, and Minidiscus trioculatus. Using Oxford Nanopore sequencing, we assembled and annotated their complete chloroplast and mitochondrial genomes. Pangenome analyses revealed that Minidiscus species consistently clustered more closely with select Thalassiosira species, whereas M. comicus formed a sister clade with Skeletonema. Circularized chloroplast genomes allowed us to characterize the full-length 16S ribosomal RNAs for each isolate, thereby leading to higher resolution of these taxa in preexisting 16S metabarcoding data. During our study, M. spinulatus was primarily restricted to the Bedford Basin. In contrast, both M. trioculatus and M. comicus had larger geographic ranges extending to the Labrador Sea, and in the case of M. comicus, to the Canadian Arctic Gateway. Weekly metabarcoding from the coastal Bedford Basin, N.S., Canada (2014-2022), revealed a seasonal succession of nanoplanktonic taxa, with Minidiscus trioculatus dominating in the early months, followed by M. comicus and M. spinulatus. Our results highlight the critical value of phytoplankton isolations and organelle genomics for expanding our understanding of the diversity and biogeography of nanoplanktonic diatoms.

RevDate: 2025-08-05
CmpDate: 2025-08-05

Wang H, Nielsen J, Zhou YJ, et al (2025)

Yeast adapts to diverse ecological niches driven by genomics and metabolic reprogramming.

Proceedings of the National Academy of Sciences of the United States of America, 122(32):e2502044122.

The famous model organism Saccharomyces cerevisiae is widely present in a variety of natural and human-associated habitats. Despite extensive studies of this organism, the metabolic mechanisms driving its adaptation to varying niches remain elusive. We here gathered genomic resources from 1,807 S. cerevisiae strains and assembled them into a high-quality pangenome, facilitating the comprehensive characterization of genetic diversity across isolates. Utilizing the pangenome, 1,807 strain-specific genome-scale metabolic models (ssGEMs) were generated, which performed well in quantitative predictions of cellular phenotypes, thus helping to examine the metabolic disparities among all S. cerevisiae strains. Integrative analyses of fluxomics and transcriptomics with ssGEMs showcased ubiquitous transcriptional regulation of metabolic flux in specific pathways (i.e., amino acid synthesis) at a population level. Additionally, the gene/reaction inactivation analysis through the ssGEMs refined by transcriptomics showed that S. cerevisiae strains from various ecological niches had undergone reductive evolution at both the genomic and metabolic network levels when compared to wild isolates. Finally, the compiled analysis of the pangenome, transcriptome, and metabolic fluxome revealed remarkable metabolic differences among S. cerevisiae strains originating from distinct oxygen-limited niches, including human gut and cheese environments, and identified convergent metabolic evolution, such as downregulation of oxidative phosphorylation pathways. Together, these results illustrate how yeast adapts to distinct niches modulated by genomic and metabolic reprogramming, and provide computational resources for translating yeast genotype to fitness in future studies.

RevDate: 2025-08-05

Liu Y, Wang X, Wu L, et al (2025)

Prediction of antimicrobial resistance in Staphylococcus aureus with a machine learning classifier based on WGS data.

Microbiology spectrum [Epub ahead of print].

UNLABELLED: The phenomenon of antimicrobial resistance (AMR) often results in treatment failure and restrictions on precision medicine, emphasizing the need for molecular diagnosis of drug resistance. The current use of machine learning (ML) techniques based on whole genome sequencing (WGS) data offers a more precise prediction of phenotypes. We incorporated WGS data from 3979 Staphylococcus aureus strains in our study. We modeled 10 common antibiotics using three types of features: gene, single nucleotide polymorphism (SNP), and k-mer to identify the best model and to determine which feature values most significantly contributed to the model's performance. The area under the curve (AUC) values of 40 ML models for 10 antibiotics ranged from 0.8345 to 0.9995. We noted that the performance indices such as the AUC of the gene model (0.9311-0.9992) and the integrated model (0.9313-0.9995) were markedly better than the SNP model (0.8345-0.9933) and the k-mer model (0.9024-0.9969). The best model AUC values for six antibiotics (cefoxitin, tetracycline, methicillin, gentamicin, erythromycin, and clindamycin) were over 0.99; nine antibiotic models had AUC values over 0.96, and all could effectively predict AMR phenotypes. Additionally, we discovered that certain non-AMR genes, such as the X998_03220 gene, significantly contributed to drug resistance prediction and overlapped in various antibiotic-related models simultaneously. Our study developed ML models that can reliably predict AMR phenotypes for commonly used antibiotics in S. aureus. We also identified potential molecular markers that can contribute to precision medicine implementation and healthcare cost reduction.

IMPORTANCE: In our study, we developed a machine learning (ML) model that reliably predicts the antimicrobial resistance (AMR) phenotypes of Staphylococcus aureus to commonly used antibiotics. This model not only predicts AMR phenotypes but also identifies potential molecular markers, which could facilitate the implementation of precision medicine and contribute to reducing healthcare costs. The integration of diverse biomarker types is crucial for enhancing model performance; however, their effectiveness may vary depending on the specific antibiotic in question. Furthermore, our pan-genome-based characterization has revealed novel potential molecular markers associated with AMR, thereby enhancing our comprehension of the underlying molecular mechanisms of AMR in S. aureus. The expedited implementation of early and targeted antimicrobial therapies for S. aureus infections is essential for advancing precision medicine and can potentially lead to significant healthcare cost savings.

RevDate: 2025-08-05

Homsombat T, Yoshii K, Fukada Y, et al (2025)

Comparative Genomics of Edwardsiella piscicida in the Japanese Flounder (Paralichthys olivaceus): Discovery and Implications of a Novel Genomic Island.

Journal of fish diseases [Epub ahead of print].

Edwardsiella piscicida is a significant pathogen that poses a particular threat to Japanese flounder (Paralichthys olivaceus) aquaculture in Japan and other countries. The damage is caused by the pathogen's ability to evade host immune defences and establish intracellular infections, intensified by its genomic plasticity and capacity for horizontal gene transfer. To investigate evolutionary adaptations between one older (2019) and four recent (2023) E. piscicida strains from the same geographical locations, we performed comparative genomic analysis of five isolates using high-quality hybrid genome assemblies and compared them with 27 Edwardsiella reference genomes. Pangenome analysis identified distinct novel genomic islands (GIs) specific to the 2023 strains. These GIs (~100 kb in size) shared 85 gene clusters encoding multiple antibiotic resistance genes, phage defence systems, mobilisation genes, and mercury resistance. In addition, they encoded integrases, transposases, and conjugative transfer genes, suggesting they function as integrative and conjugative elements (ICEs), a type of mobile genetic element. Phenotypic characterisation showed that the 2023 strains carrying the novel GIs had increased antibiotic resistance but no significant difference in virulence in Japanese flounder infection trials. These findings highlight the recent genomic diversification of E. piscicida in aquaculture and the importance of monitoring emerging GIs driving antibiotic resistance and environmental persistence.

RevDate: 2025-08-06

Zhao L, Hu Y, Ji QY, et al (2025)

Chromosome-level reference genome of Vitis piasezkii var. pagnucii provides insights into a new locus of resistance to grapevine powdery mildew.

Horticulture research, 12(9):uhaf146.

Grapevine powdery mildew (GPM), caused by Erysiphe necator, poses a significant threat to all green grapevine tissues, leading to substantial economic losses in viticulture. Traditional grapevine cultivars derived from Vitis vinifera are highly susceptible to GPM, whereas the wild Chinese accession Baishui-40 (BS-40) of V. piasezkii var. pagnucii exhibits robust resistance. To illuminate the genetic basis of resistance, we sequenced and assembled the chromosome-level genome of 'BS-40', achieving a total mapped length of 578.6 Mb distributed across nineteen chromosomes. A comprehensive annotation identified 897 nucleotide-binding leucine-rich repeat (NLR) genes in the 'BS-40' genome, which exhibited high sequence similarity across Vitis genomes. Of these NLR genes, 284 were differentially expressed upon GPM infection. A hybrid population of 'BS-40' and V. vinifera was constructed and 195 progenies were whole-genome re-sequenced. A new GPM-resistant locus, designated Ren17, located within the 0.74-1.23 Mb region on chromosome 1 was identified using genome-wide association study, population selection, and QTL analysis. Recombinant events indicated that an NLR gene cluster between 1 045 489 and 1 089 719 bp on chromosome 1 is possibly the key contributor to GPM resistance in 'BS-40'. Based on an SNP within this region, a dCAPS marker was developed that can predict the GPM resistance in 'BS-40'-derived materials with 99.4% accuracy in the progenies of 'BS-40' and V. vinifera. This chromosome-level genome assembly of V. piasezkii var. pagnucii provides not only a valuable resource for grapevine evolution, genetic analysis, and pan-genome studies but also a new locus, Ren17, as a promising target for GPM-resistant breeding in grapevine.

RevDate: 2025-08-02

Huang C, Ding C, Tang J, et al (2025)

Evolutionary and functional characterization of tea plant DELLA proteins.

Plant physiology and biochemistry : PPB, 229(Pt A):110317 pii:S0981-9428(25)00845-9 [Epub ahead of print].

DELLA proteins function as key negative regulators in gibberellin signaling, driving numerous molecular mechanisms that impact plant morphogenesis and ontogeny. In this study, eight representative DELLA proteins were identified based on pan-genome analysis of Camellia sinensis var. assamica (CSA) and Camellia sinensis var. sinensis (CSS). Phylogenetically, these DELLA proteins were subdivided into five groups, and their evolutionary trajectories were systematically investigated. Five DELLA proteins identified in Longjing43, including a newly discovered CsDELLA5, were found to be crucial for growth and exhibited distinct expression patterns in different tissues of tea plant during seasonal transitions. The five proteins were mainly located in the nucleus, while CsDELLA2 also exhibited a spotted distribution in the cytoplasm. CsDELLAs also showed varied protein structure, hormone responses and expression patterns in bud sprouting. Notably, the presence of GAs significantly enhanced the interaction between CsDELLA2/4/5 and CsGID1a/b. Inhibition of the expression of CsDELLA2 and CsDELLA4 significantly increased the bud sprouting rate, indicating a negative function in the regulation of bud break. This study offers valuable insights into the roles of DELLA proteins in tea plants and provides a theoretical foundation for DELLA protein research in other species.

RevDate: 2025-08-01

Elias R, Phelan JE, Lito L, et al (2025)

Genome-Wide Analysis and Longitudinal Study of Klebsiella pneumoniae in Portugal: Tracing the Evolution and Spread of Carbapenem Resistance.

International journal of antimicrobial agents pii:S0924-8579(25)00138-4 [Epub ahead of print].

BACKGROUND: Carbapenem-resistant Klebsiella pneumoniae (CRKP) has high incidence in Portugal, causing severe and often fatal infections.

OBJECTIVES: Characterize the evolutionary history and epidemiology of CRKP in Portugal over a 40-year period.

METHODS: WGS was performed using the Illumina platform. In silico multilocus sequence typing, surface antigen characterization, and resistance gene detection were subsequently carried out. Core and pan-genome analyses were conducted using Roary. Genomic clusters (GCs) were identified based on a 21-SNP threshold. To estimate the divergence times of the most prevalent sequence types (ST) in the dataset, Bayesian evolutionary analysis was performed using BEAST.

RESULTS: Nineteen GCs harboring carbapenemases were identified. The blaKPC-3 gene was the most prevalent carbapenemase, linked to strains circulating in both hospital and community settings, with dissemination patterns at regional, interregional, and international levels. ST15 was the most established sequence type in Portugal, with nine distinct GCs identified in both clinical and environmental samples. Towards the end of the 2010s, ST147 and ST13 were responsible for significant outbreaks associated with blaKPC-3.

CONCLUSIONS: This study underscores the value of genomic-based surveillance in understanding the evolution of high-risk clones coupled with the spread of AMR determinants. The data highlight a shift in ST predominance across the country, from an ST15-dominated period strongly associated with ESBL dissemination to the emergence of ST147 and ST13 CRKP clones, the latter associated with international transmission. This work further stresses the importance of cross-border surveillance efforts to monitor the emergence and dissemination of CRKP strains and inform risk assessment and prevention.
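
The abstract above says genomic clusters were identified "based on a 21-SNP threshold" but does not describe the clustering procedure. A minimal sketch of one common reading follows, assuming single-linkage clustering in which two isolates join the same cluster when their pairwise SNP distance is at or below the threshold. The function names, the toy sequences, and the inclusive comparison are all assumptions made for illustration, not the authors' method.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b):
    """Count positions where two equal-length aligned sequences differ."""
    return sum(1 for x, y in zip(seq_a, seq_b) if x != y)


def cluster_by_snp_threshold(seqs, threshold=21):
    """Single-linkage clustering: link isolates whose distance <= threshold.

    `seqs` maps isolate name -> aligned core-genome sequence (hypothetical input).
    """
    parent = {name: name for name in seqs}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(seqs, 2):
        if snp_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in seqs:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Toy alignment (hypothetical data); a threshold of 2 keeps the example small.
toy = {
    "iso1": "AAAAAAAAAA",
    "iso2": "AAAAAAAATT",  # 2 SNPs from iso1
    "iso3": "AAAAAATTTT",  # 4 SNPs from iso1, 2 from iso2
    "iso4": "CCCCCCCCCC",  # 10 SNPs from every other isolate
}
clusters = cluster_by_snp_threshold(toy, threshold=2)
print(clusters)
```

With single linkage, iso1 and iso3 end up in the same cluster through iso2 even though they differ by more than the threshold. A stricter definition, such as complete linkage, would split them.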

RevDate: 2025-08-09

Li W, Sun J, Wu Q, et al (2025)

Global genomics of Lactococcus lactis: horizontal gene transfer and intergenic variation drive multiple domestication and dairy adaptation.

Journal of advanced research pii:S2090-1232(25)00583-1 [Epub ahead of print].

INTRODUCTION: Lactococcus lactis is a crucial lactic acid bacterium of great economic significance for cheese production. The species exhibits wide distribution and significant genetic diversity, yet the underlying drivers of its differentiation remain elusive.

OBJECTIVES: Lactococcus lactis exhibits complex genetic diversity, yet the mechanisms driving its differentiation and niche adaptation remain poorly understood.

METHODS: This study assembled a genome dataset of 1008 isolates of Lactococcus lactis from six major habitats across five continents. Combined with public database data, population genomics and functional genomics were used to analyze population structure and adaptation.

RESULTS: To elucidate its population structure and domestication history, 1008 genomes from six diverse habitats across five continents were analyzed, revealing two major genetic branches subdivided into ten distinct lineages. Phylogenomic and ancestral analyses support a multiple domestication model, with the ancestral plant-associated lineage (L6) having diversified into dairy-adapted lineages (L8-L10) through extensive horizontal gene transfer, primarily facilitated by mobile genetic elements. Notably, intergenic regions (IGRs) critically influence phenotypic diversity and genetic structure, underscoring the functional significance of non-coding sequences in microbial adaptation. Pan-genome analysis highlights extensive accessory gene and IGR diversity, with habitat-specific enrichments: dairy lineages are enriched in mobile genetic elements and carbohydrate-active enzymes, while plant isolates show reduced genetic exchange. A machine learning framework integrating single nucleotide polymorphisms, genes, and IGRs accurately predicts isolate-specific fermentation traits, enabling efficient industrial strain selection.

CONCLUSION: These findings redefine non-coding regions as key drivers of microbial domestication and provide a genomic framework to optimize Lactococcus lactis for dairy fermentation and biotechnology, bridging ecological adaptation with applied innovation.

RevDate: 2025-08-13
CmpDate: 2025-08-01

Cui W, Fendley JM, Srikant S, et al (2025)

A minimal model of panimmunity maintenance by horizontal gene transfer in the ecological dynamics of bacteria and phages.

Proceedings of the National Academy of Sciences of the United States of America, 122(31):e2417628122.

Bacteria and phages have been in an ongoing arms race for billions of years. To resist phages, bacteria have evolved numerous defense systems, which nevertheless are still overcome by counterdefense mechanisms of specific phages. These defense/counterdefense systems are a major element of microbial genetic diversity and have been demonstrated to propagate between strains by horizontal gene transfer (HGT). It has been proposed that the totality of defense systems found in microbial communities collectively forms a distributed "pan-immune" system with individual elements moving between strains via ubiquitous HGT. Here, we formulate a Lotka-Volterra type model of a bacteria/phage community interacting via a combinatorial variety of defense/counterdefense systems and show that HGT enables stable maintenance of diverse defense/counterdefense genes in the microbial pan-genome even when individual microbial strains inevitably undergo extinction. This stability requires the HGT rate to be sufficiently high to ensure that some descendant of a "dying" strain survives, thanks to the immunity acquired through HGT from the community at large, thus establishing a new strain. This mechanism of persistence for the pan-immune gene pool is fundamentally similar to the "island migration" model of ecological diversity, with genes moving between genomes instead of species migrating between islands.
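
For readers unfamiliar with the term, the sketch below integrates the classical two-species Lotka-Volterra system, with one bacterial strain as prey and one phage as predator. It is not the paper's model, which involves many strains, defense/counterdefense genes, and HGT. All parameter names and values here are hypothetical and chosen only for illustration.

```python
import math


def derivatives(b, p, growth=1.0, infect=0.5, burst=0.2, decay=0.6):
    """Classical Lotka-Volterra rates for bacteria b and phage p."""
    db = growth * b - infect * b * p  # bacterial growth minus phage predation
    dp = burst * b * p - decay * p    # phage production minus phage decay
    return db, dp


def rk4_step(b, p, dt):
    """Advance (b, p) by one fourth-order Runge-Kutta step of size dt."""
    k1 = derivatives(b, p)
    k2 = derivatives(b + 0.5 * dt * k1[0], p + 0.5 * dt * k1[1])
    k3 = derivatives(b + 0.5 * dt * k2[0], p + 0.5 * dt * k2[1])
    k4 = derivatives(b + dt * k3[0], p + dt * k3[1])
    b_next = b + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
    p_next = p + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return b_next, p_next


def simulate(b0, p0, dt, steps):
    """Return the list of (bacteria, phage) states, including the start."""
    states = [(b0, p0)]
    for _ in range(steps):
        states.append(rk4_step(*states[-1], dt))
    return states


def invariant(b, p, growth=1.0, infect=0.5, burst=0.2, decay=0.6):
    """Quantity conserved along exact trajectories of the classical system."""
    return burst * b - decay * math.log(b) + infect * p - growth * math.log(p)


trajectory = simulate(2.0, 1.0, dt=0.001, steps=10000)
drift = abs(invariant(*trajectory[-1]) - invariant(*trajectory[0]))
print(f"invariant drift over the run: {drift:.2e}")
```

In this two-species form, both populations cycle indefinitely and neither goes extinct. The abstract's point concerns the richer setting in which individual strains do go extinct while HGT keeps their defense genes in the community gene pool.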

RevDate: 2025-08-15

Cui S, Ma W, Peng H, et al (2025)

Genome-wide mining reveals the genetic plasticity of antibiotic resistance/virulence factor genes in Enterobacter hormaechei subsp. xiangfangensis.

Journal of applied microbiology, 136(8):.

AIMS: This study aims to systematically characterize the genetic basis and intra-species differentiation of antibiotic resistance/virulence factor genes (ARGs/VFGs) in Enterobacter hormaechei subsp. xiangfangensis.

METHODS AND RESULTS: A high-quality metagenome-assembled genome of E. hormaechei subsp. xiangfangensis bin99 (97.22% completeness, 1.63% contamination) was acquired. Phylogenomic and average nucleotide identity (≥95%) analyses confirmed its taxonomic assignment. Pan-genomic analysis revealed an open configuration (Heaps' exponent B = 0.34) with a large accessory genome (approximately 2965 genes) and a stabilized core genome (1139 genes). Critically, a strong positive correlation (r = 0.86, P < 2.2e-16) was observed between mobile genetic elements (MGEs) and accessory gene abundance, suggesting horizontal gene transfer (HGT) as a potential driver of genome diversity. Functional annotation highlighted distinct roles: core genes were enriched in essential metabolism, while accessory/strain-specific genes were linked to adaptation. Screening identified significant inter-strain variation in ARGs (n = 31) and VFGs (n = 35). Bin99 itself harbored 19 ARGs (e.g. multidrug: soxS, ramA, oqxB) and 40 VFGs (e.g. flagella, T6SS). Importantly, MGE abundance showed a significant positive correlation with ARGs (r = 0.67, P < 2.2e-16) but a negative correlation with VFGs (r = -0.29, P < 3.7e-9), suggesting that ARGs were frequently linked to MGEs facilitating HGT-mediated spread, while VFGs might rely less on this route.

CONCLUSIONS: The findings provide genome-wide evidence for distinct genetic plasticity underlying ARG and VFG evolution in E. hormaechei subsp. xiangfangensis, highlighting implications for resistance and virulence dissemination.
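
The "open configuration" reported above rests on a power-law (Heaps' law) fit. The sketch below shows how such an exponent can be estimated, assuming the common convention that pan-genome size grows as P(N) = a * N^b with the number of genomes N, and that 0 < b < 1 indicates an open pan-genome. The abstract does not state which tool or fitting procedure was used. The data are synthetic: they are generated from the reported exponent 0.34 and a hypothetical scale factor of 3000, so the fit simply recovers its own inputs.

```python
import math


def fit_power_law(genome_counts, pan_sizes):
    """Least-squares fit of log(P) = log(a) + b * log(N); returns (a, b)."""
    xs = [math.log(n) for n in genome_counts]
    ys = [math.log(size) for size in pan_sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def is_open(b):
    """Assumed convention: an exponent strictly between 0 and 1 means open."""
    return 0.0 < b < 1.0


# Synthetic pan-genome curve (hypothetical values, not data from the study).
counts = [1, 2, 4, 8, 16, 32]
sizes = [3000 * n ** 0.34 for n in counts]

a_hat, b_hat = fit_power_law(counts, sizes)
print(f"a = {a_hat:.1f}, b = {b_hat:.2f}, open = {is_open(b_hat)}")
```

Real pan-genome curves are usually averaged over many random orderings of the genomes before fitting, a step this sketch omits.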

RevDate: 2025-08-01
CmpDate: 2025-08-01

Velo J, Caipang CM, Noblezada A, et al (2025)

Whole genome sequence of Arthrobacter sp. from Iloilo City landfill soil unveils potential plastic biodegradation genes.

Biodegradation, 36(4):72.

Plastics are synthetic materials that have transformed society in a lot of ways, yet widespread use of these materials has caused a staggering amount of pollution in the environment. Among these plastics, polypropylene and low-density polyethylene are two of the most used plastics for packaging globally. Currently, only two enzymes have been characterized for low-density polyethylene degradation, while no specific enzymes have been confirmed to degrade polypropylene. In this study, one bacterial isolate from landfill soil was assessed for potential polypropylene and low-density polyethylene degradation abilities using gravimetric methods by measuring the initial and final weight of plastic films. Results showed that after 60 days of incubation, a total decrease of 8.04% was observed for polypropylene plastics and 3.13% for low-density polyethylene plastics. Whole genome sequencing using Illumina NextSeq™ 1000 generated a total number of 3,746,011 assembled base pairs for Isolate 1 using SPAdes. Phylogenetic tree construction using the Bacterial Pan-Genome Analysis (BPGA) tool revealed close relation of the isolate to Arthrobacter sp. Analysis of the annotated whole genome sequence against the Plastic database revealed 11 putative protein coding genes that encode enzymes with potential to break down plastics.

RevDate: 2025-08-03

Qiu Y, Guo P, Tian H, et al (2025)

The restriction impacts of the Type III restriction-modification system on the transmission dynamics of antimicrobial resistance genes in Campylobacter jejuni.

Frontiers in microbiology, 16:1496275.

INTRODUCTION: The spread of antibiotic resistance genes among Campylobacter jejuni (C. jejuni) is a serious problem, and the effects of the restriction-modification (R-M) system on the transmission dynamics of these genes in C. jejuni remain poorly understood.

MATERIALS AND METHODS: Complete genome sequences of C. jejuni strains available up to March 25, 2024 were retrieved from the BV-BRC database. Phylogenetic and resistance analyses were used to examine the distribution of resistance genes in C. jejuni. The impact of the R-M systems on AMR gene transmission between C. jejuni strains, and the possible mechanisms, were explored through recombination, pangenome, and mobile genetic element analyses.

RESULTS: C. jejuni strains carrying the Type III R-M system have significantly fewer antimicrobial resistance (AMR) genes than strains without this system (p < 0.0001), with a covariance value of -0.0526. The recombination analysis also shows that the median number of AMR genes in strains lacking the Type III R-M system is 19.38% higher than in strains carrying that system (p < 0.0001). We also find, through pangenome and mobile genetic element analyses, that the horizontal gene transfer frequency might have a limited relationship with the Type III R-M system in C. jejuni.

CONCLUSION: Our research indicates that the Type III R-M system might restrict the transmission of AMR genes potentially by affecting recombination in C. jejuni, which provides a theoretical basis for addressing the drug resistance problem.

RevDate: 2025-07-31
CmpDate: 2025-07-31

Jha UC, Naik YD, Priya M, et al (2025)

Chickpea (Cicer arietinum L.) battling against heat stress: plant breeding and genomics advances.

Plant molecular biology, 115(4):101 pii:10.1007/s11103-025-01628-z.

Global climate change, particularly the increasing frequency and intensity of heat stress, poses a significant threat to crop productivity. Chickpea (Cicer arietinum L.) employs various physiological, biochemical, and molecular mechanisms to cope with elevated temperatures, including maintaining leaf chlorophyll content to preserve the functional integrity of photosystem II (PSII) and enhancing canopy temperature depression to reduce overheating. These traits are crucial for sustaining photosynthetic efficiency, plant health, and yield stability under heat stress. Recent advances in multi-omics approaches (including genomics, transcriptomics, proteomics, and metabolomics) have enhanced our understanding of the genetic basis of heat stress tolerance in chickpea. These tools have facilitated the identification of key genes and molecular pathways involved in heat stress responses. Functional characterization of these genes has provided insights into their roles within the complex metabolic and signaling networks that underpin heat resilience. This review explores integrating conventional and modern breeding technologies with high-throughput phenotyping (HTP) platforms to accelerate genetic gains in chickpea under heat stress. HTP tools enable rapid, precise screening of heat-resilient traits, facilitating early selection of superior genotypes. We also highlight recent genomic advancements, including genome-wide association studies, whole-genome resequencing, and pangenome assemblies, which have uncovered novel structural variants, candidate genes, and haplotypes associated with heat tolerance. Leveraging these resources in conjunction with functional analyses offers new opportunities for breeding climate-resilient chickpea cultivars capable of delivering stable yields and quality under adverse conditions. These developments are crucial for safeguarding chickpea productivity and ensuring global food and nutrition security amid climate change.

RevDate: 2025-07-31

Wang J, Chen D, Hu H, et al (2025)

Functional characterization of OsLT9 in regulating rice leaf thickness.

Journal of genetics and genomics = Yi chuan xue bao pii:S1673-8527(25)00207-3 [Epub ahead of print].

Leaf thickness in rice critically influences photosynthetic efficiency and yield, yet its genetic basis remains poorly understood, with few functional genes previously characterized. In this study, we employ a pangenome-wide association study (Pan-GWAS) on 302 diverse rice accessions from southern China, identifying 49 quantitative trait loci (QTLs) associated with leaf thickness. The most significant locus, qLT9, is fine-mapped to a 79 kb region on chromosome 9. Transcriptomic and genomic sequence analyses identify LOC_Os09g33480, which encodes a protein belonging to the Multiple Organellar RNA Editing Factor (MORF) family, as the key candidate gene. Overexpression and complementation transgenic experiments confirm LOC_Os09g33480 (OsLT9) as the functional gene underlying qLT9, demonstrating that a 24-bp indel in its promoter correlates with expression levels and leaf thickness. Notably, OsLT9 overexpression lines show not only thicker leaves, but also significantly enhanced photosynthetic efficiency and grain yield, establishing a link between leaf thickness modulation and yield enhancement. Population genomic analyses indicate strong selection for OsLT9 during domestication and breeding, with modern cultivars favoring the thick-leaf haplotype of OsLT9. This study establishes OsLT9 as a key regulator controlling leaf thickness in rice, and provides a valuable genetic resource for molecular breeding of high-yielding rice through optimization of plant architecture.

RevDate: 2025-08-02

Rotaru LI, Surleac M (2025)

PeGAS: a versatile bioinformatics pipeline for antimicrobial resistance, virulence and pangenome analysis.

Bioinformatics advances, 5(1):vbaf165.

MOTIVATION: Antimicrobial resistance is increasingly recognized as one of the most significant global health threats, with profound implications for human, animal, and environmental health. Genome analysis represents a very useful tool that provides accurate and reproducible results allowing for the advancement of knowledge regarding antimicrobial resistance diagnosis, therapeutics, surveillance, transmission, and evolution. However, due to the increasing complexity of bacterial genome analysis and the computational power required for genomic approaches, there is a continuous need for comprehensive, user-friendly tools for data analysis. We developed the Pangenome and Genomic Analysis Suite (PeGAS) to address some of these challenges by offering an all-in-one pipeline that performs a range of analyses.

RESULTS: PeGAS integrates key genomic analysis features for bacterial whole-genome sequencing, including the prediction of antimicrobial resistance profiles, sorted by various categories of antibiotics, virulence factor (VF) detection, and plasmid replicon assignment. The pipeline also performs pangenome analysis, multilocus sequence typing, and genome assembly quality control (by reporting statistics such as GC content, contig length, and the number of contigs, as well as variation from certain GC thresholds), providing a comprehensive genomic overview. PeGAS can also restart seamlessly after sporadic interruptions that might occur during long or resource-intensive runs.

PeGAS is available at: https://github.com/liviurotiul/PeGAS.

RevDate: 2025-08-15
CmpDate: 2025-08-12

Wang L, Jiang X, Jiao W, et al (2025)

Pangenome analysis provides insights into legume evolution and breeding.

Nature genetics, 57(8):2052-2061.

Grain legumes hold great promise for advancing sustainable agriculture. Although the evolutionary history of legume species has been investigated, the conserved mechanisms that drive adaptive evolution and govern agronomic improvement remain elusive. Here we present high-quality genome assemblies for nine widely consumed pulses, including common bean, chickpea, pea, lentil, faba bean, pigeon pea, cowpea, mung bean and hyacinth bean. Pangenome analysis reveals the expansion of distinct gene sets in cool-season and warm-season legumes, highlighting the role of gene birth and duplication in the autoregulation of nodulation. Notably, hundreds of genes undergo convergent selection during the evolution of legumes, affecting agronomic traits such as seed weight. In addition, we demonstrate that tandem amplification of transposable elements in gene-depleted regions has a crucial role in driving genome enlargement and the formation of regulatory elements in cool-season legumes. Our results provide insights into the molecular mechanisms underlying the diversification of legumes and represent a valuable resource for facilitating legume breeding.

ESP Quick Facts

ESP Origins

In the early 1990s, Robert Robbins was a faculty member at Johns Hopkins, where he directed the informatics core of GDB, the human gene-mapping database of the international human genome project. To share papers with colleagues around the world, he set up a small paper-sharing section on his personal web page. This small project evolved into The Electronic Scholarly Publishing Project.

ESP Support

In 1995, Robbins became the VP/IT of the Fred Hutchinson Cancer Research Center in Seattle, WA. Soon after arriving in Seattle, Robbins secured funding, through the ELSI component of the US Human Genome Project, to create the original ESP.ORG web site, with the formal goal of providing free, world-wide access to the literature of classical genetics.

ESP Rationale

Although the methods of molecular biology can seem almost magical to the uninitiated, the original techniques of classical genetics are readily appreciated by one and all: cross individuals that differ in some inherited trait, collect all of the progeny, score their attributes, and propose mechanisms to explain the patterns of inheritance observed.

ESP Goal

In reading the early works of classical genetics, one is drawn, almost inexorably, into ever more complex models, until molecular explanations begin to seem both necessary and natural. At that point, the tools for understanding genome research are at hand. Helping readers reach this point was the original goal of The Electronic Scholarly Publishing Project.

ESP Usage

Usage of the site grew rapidly and has remained high. Faculty began to use the site for their assigned readings. Other on-line publishers, ranging from The New York Times to Nature, referenced ESP materials in their own publications. Nobel laureates (e.g., Joshua Lederberg) regularly used the site and even wrote to suggest changes and improvements.

ESP Content

When the site began, no journals were making their early content available in digital format. As a result, ESP was obliged to digitize classic literature before it could be made available. For many important papers — such as Mendel's original paper or the first genetic map — ESP had to produce entirely new typeset versions of the works, if they were to be available in a high-quality format.

ESP Help

Early support from the DOE component of the Human Genome Project was critically important for getting the ESP project on a firm foundation. Since that funding ended (nearly 20 years ago), the project has been operated as a purely volunteer effort. Anyone wishing to assist in these efforts should send an email to Robbins.

ESP Plans

With the development of methods for adding typeset side notes to PDF files, the ESP project now plans to add annotated versions of some classical papers to its holdings. We also plan to add new reference and pedagogical material. We have already started providing regularly updated, comprehensive bibliographies to the ESP.ORG site.

Electronic Scholarly Publishing
961 Red Tail Lane
Bellingham, WA 98226

E-mail: RJR8222 @ gmail.com

Papers in Classical Genetics

The ESP began as an effort to share a handful of key papers from the early days of classical genetics. Now the collection has grown to include hundreds of papers, in full-text format.

Digital Books

Along with papers on classical genetics, ESP offers a collection of full-text digital books, including many works by Darwin and even a collection of poetry — Chicago Poems by Carl Sandburg.

Timelines

ESP now offers a large collection of user-selected side-by-side timelines (e.g., all science vs. all other categories, or arts and culture vs. world history), designed to provide a comparative context for appreciating world events.

Biographies

Biographical information about many key scientists (e.g., Walter Sutton).

Selected Bibliographies

Bibliographies on several topics of potential interest to the ESP community are automatically maintained and generated on the ESP site.

ESP Picks from Around the Web (updated 28 JUL 2024)