Bibliography on: Pangenome

The Electronic Scholarly Publishing Project: Providing world-wide, free access to classic scientific papers and other scholarly materials, since 1993.


ESP: PubMed Auto Bibliography. Created: 01 Apr 2025 at 01:33

Pangenome

Although enforced stability of genomic content is ubiquitous among multicellular eukaryotes (MCEs), the opposite is proving true of prokaryotes, which exhibit remarkable and adaptive plasticity of genomic content. Early bacterial whole-genome sequencing efforts found that whenever a particular "species" was re-sequenced, genes turned up that had not been detected earlier: entirely new genes, not merely new alleles. This led to the concepts of the bacterial core-genome, the set of genes found in all members of a particular "species", and the flex-genome, the set of genes found in some, but not all, members of the "species". Together these make up the species' pan-genome.
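The three definitions above reduce to simple set operations on per-strain gene content. The following is a minimal sketch, assuming each strain is already summarized as a set of gene-family identifiers; the function name, strain names, and gene names are all hypothetical and chosen only for illustration. Real pan-genome tools also have to cluster homologous genes first, which this sketch does not attempt.

```python
def partition_pangenome(strain_genes):
    """Split gene content into pan-, core-, and flex-genome sets.

    strain_genes maps a strain name to the set of gene families found in it.
    (Hypothetical helper, not taken from any pan-genome tool.)
    """
    gene_sets = [set(genes) for genes in strain_genes.values()]
    if not gene_sets:
        return set(), set(), set()
    pan = set().union(*gene_sets)        # genes found in at least one strain
    core = set.intersection(*gene_sets)  # genes found in every strain
    flex = pan - core                    # genes found in some, but not all, strains
    return pan, core, flex


# Toy data: invented strain and gene names, for illustration only.
strains = {
    "strain_A": {"geneA", "geneB", "geneC"},
    "strain_B": {"geneA", "geneB", "geneD"},
    "strain_C": {"geneA", "geneC", "geneE"},
}
pan, core, flex = partition_pangenome(strains)
print("core:", sorted(core))
print("flex:", sorted(flex))
print("pan size:", len(pan))
```

In this toy example the core-genome keeps only the gene shared by all three strains, and each added strain can enlarge the pan-genome, which is the pattern the re-sequencing observation describes.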

Created with PubMed® Query: ( pangenome OR "pan-genome" OR "pan genome" ) NOT pmcbook NOT ispreviousversion
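For readers who want to re-run this query programmatically, the sketch below builds a request URL for the NCBI E-utilities `esearch` endpoint. It is an assumption that E-utilities is a suitable route; the page does not say how the bibliography itself is generated. The helper name `build_esearch_url` and the `retmax` value of 100 are illustrative choices, and the code only constructs the URL without contacting NCBI.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Query string copied from the bibliography header.
QUERY = '( pangenome OR "pan-genome" OR "pan genome" ) NOT pmcbook NOT ispreviousversion'


def build_esearch_url(term, retmax=100):
    """Return an esearch URL for a PubMed query (hypothetical helper)."""
    params = {
        "db": "pubmed",     # search the PubMed database
        "term": term,       # the query, URL-encoded by urlencode
        "retmax": retmax,   # maximum number of PMIDs to return (illustrative)
        "retmode": "json",  # ask for a JSON response
    }
    return ESEARCH_URL + "?" + urlencode(params)


url = build_esearch_url(QUERY)
print(url)
```

Fetching the URL (for example with `urllib.request.urlopen`) would return matching PubMed identifiers; NCBI's usage guidelines on request rates and API keys apply.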

Citations: The Papers (from PubMed®)


RevDate: 2025-03-29

Romanenko L, Bystritskaya E, Otstavnykh N, et al (2025)

Phenotypic and Genomic Characterization of Oceanisphaera submarina sp. nov. Isolated from the Sea of Japan Bottom Sediments.

Life (Basel, Switzerland), 15(3):.

A Gram-negative aerobic, motile bacterium KMM 10153[T] was isolated from bottom sediment sampled from the Sea of Japan at a depth of 256 m, Russia. Strain KMM 10153[T] grew in 0-12% NaCl at temperatures ranging from 4 to 42 °C and produced brown diffusible pigments. Based on the 16S rRNA gene and whole genome sequences analyses, novel bacterium KMM 10153[T] was affiliated with the genus Oceanisphaera (phylum Pseudomonadota) showing the highest 16S rRNA gene sequence similarities of 98.94% to Oceanisphaera arctica KCTC 23013[T], 98.15% to Oceanisphaera donghaensis BL1[T], and similarity values of <98% to other validly described Oceanisphaera species. The pairwise Average Nucleotide Identity (ANI) and Average Amino Acid Identity (AAI) values between the novel strain KMM 10153[T] and the three closest type strains Oceanisphaera arctica KCTC 23013[T], Oceanisphaera litoralis DSM 15406[T] and Oceanisphaera sediminis JCM 17329[T] were 89.4%, 89.1%, 87.41%, and 90.7%, 89.8%, 89.7%, respectively. The values of digital DNA-DNA hybridization (dDDH) were below 39.3%. The size of the KMM 10153[T] draft genome was 3,558,569 bp, and the GC content was 57.5%. The genome of KMM 10153[T] harbors 343 unique genes with the most abundant functional classes consisting of transcription, mobilome, amino acid metabolism, and transport. Strain KMM 10153[T] contained Q-8 as the predominant ubiquinone and C16:1ω7c, C16:0, and C18:1ω7c as the major fatty acids. The polar lipids were phosphatidylethanolamine, phosphatidylglycerol, diphosphatidylglycerol, and phosphatidic acid. Based on the distinctive phenotypic characteristics and the results of phylogenetic and genomic analyses, the marine bacterium KMM 10153[T] could be classified as a novel Oceanisphaera submarina sp. nov. The type strain of the species is strain KMM 10153[T] (=KCTC 8836[T]).

RevDate: 2025-03-27

Nedashkovskaya O, Baldaev S, Ivaschenko A, et al (2025)

Description and Comparative Genomics of Algirhabdus cladophorae gen. nov., sp. nov., a Novel Aerobic Anoxygenic Phototrophic Bacterial Epibiont Associated with the Green Alga Cladophora stimpsonii.

Life (Basel, Switzerland), 15(3):.

A novel, strictly aerobic, non-motile, and pink-pigmented bacterium, designated 7Alg 153[T], was isolated from the Pacific green alga Cladophora stimpsonii. Strain 7Alg 153[T] was able to grow at 4-32 °C in the presence of 1.5-4% NaCl and hydrolyze L-tyrosine, gelatin, aesculin, Tweens 20, 40, and 80 and urea, as well as produce catalase, oxidase, and nitrate reductase. The novel strain 7Alg 153[T] showed the highest similarity of 96.75% with Pseudaestuariivita rosea H15[T], followed by Thalassobius litorarius MME-075[T] (96.60%), Thalassobius mangrovi GS-10[T] (96.53%), Tritonibacter litoralis SM1979[T] (96.45%), and Marivita cryptomonadis CL-SK44[T] (96.38%), indicating that it belongs to the family Roseobacteraceae, the order Rhodobacterales, the class Alphaproteobacteria, and the phylum Pseudomonadota. The respiratory ubiquinone was Q-10. The main polar lipids were phosphatidylethanolamine, phosphatidylglycerol, diphosphatidylglycerol, phosphatidylcholine, two unidentified aminolipids, and one unidentified lipid. The predominant cellular fatty acids (>5%) were C18:1 ω7c, C16:0, C18:0, and 11-methyl C18:1 ω7c. The 7Alg 153[T] genome is composed of a single circular chromosome of 3,786,800 bp and two circular plasmids of 53,157 bp and 37,459 bp, respectively. Pan-genome analysis showed that the 7Alg 153[T] genome contains 33 genus-specific clusters spanning 92 genes. The COG20-annotated singletons were more often related to signal transduction mechanisms, cell membrane biogenesis, transcription, and transport, and the metabolism of amino acids. The complete photosynthetic gene cluster (PGC) for aerobic anoxygenic photosynthesis (AAP) was found on a 53 kb plasmid. Based on the phylogenetic evidence and phenotypic and chemotaxonomic characteristics, the novel isolate represents a novel genus and species within the family Roseobacteraceae, for which the name Algirhabdus cladophorae gen. nov., sp. nov. is proposed. The type strain is 7Alg 153[T] (=KCTC 72606[T] = KMM 6494[T]).

RevDate: 2025-03-27
CmpDate: 2025-03-27

Miao J, Wang Q, Zhang Z, et al (2025)

Pangenome graph mitigates heterozygosity overestimation from mapping bias: a case study in Chinese indigenous pigs.

BMC biology, 23(1):89.

BACKGROUND: Breeds genetically distant from the reference genome often show considerable differences in DNA fragments, making it difficult to achieve accurate mappings. The genetic differences between pig reference genome (Sscrofa11.1) and Chinese indigenous pigs may lead to mapping bias and affect subsequent analyses.

RESULTS: Our analysis revealed that pangenome exhibited superior mapping accuracy to the Sscrofa11.1, reducing false-positive mappings by 1.4% and erroneous mappings by 0.8%. Furthermore, the pangenome yielded more accurate genotypes of SNP (F1: 0.9660 vs. 0.9607) and INDEL (F1: 0.9226 vs. 0.9222) compared to Sscrofa11.1. In real sequencing data, the inconsistent SNPs called from the pangenome exhibited lower genome heterozygosity compared to those identified by the Sscrofa11.1, including observed heterozygosity and nucleotide diversity. The same reduction of heterozygosity overestimation was also found in the chicken pangenome.

CONCLUSIONS: This study quantifies the mapping bias of Sscrofa11.1 in Chinese indigenous pigs, demonstrating that mapping bias can lead to an overestimation of heterozygosity in Chinese indigenous pig breeds. The adoption of a pig pangenome mitigates this bias and provides a more accurate representation of genetic diversity in these populations.

RevDate: 2025-03-27
CmpDate: 2025-03-27

Tahir Ul Qamar M, Fatima K, Rao MJ, et al (2025)

Comparative genomics profiling of Citrus species reveals the diversity and disease responsiveness of the GLP pangenes family.

BMC plant biology, 25(1):388.

Citrus is an important nutritional fruit globally; however, its yield is affected by various stresses. This study presents the draft pangenome of Citrus, developed using 11 species to examine their genetic diversity and identify members of the germin-like proteins (GLPs) gene family involved in disease responsiveness. The developed sequence-based pangenome contains 954 Mb sequence and 74,755 genes. The comparative genomics analysis revealed the presence-absence variations (PAVs) among the Citrus genomes and species-specific protein-coding genes. Gene-based pangenome analysis revealed 4,936 new genes missing in the reference genome and highlighted the core and shell genes with putative functions in stress regulation. The pangenome-wide identification of GLP gene family members indicated the intraspecies diversity among the members across 11 genomes by analyzing their gene structure, motifs, and chromosomal distribution patterns. The synteny and evolutionary constraints analyses of Citrus GLPs provide detailed evidence of their evolutionary conservation and divergence. Further, the interaction, functional enrichment, and promoter analysis revealed their involvement in abiotic-, biotic-stress, signaling, and development-related pathways. The expression patterns of C. sinensis GLPs were studied in Huanglongbing (HLB) and Citrus canker disease. Several genes including CsGLPs1-2 and CsGLPs8-4 showed changes in expression patterns under both disease conditions. The qRT-PCR analysis revealed that these two genes were highly expressed in leaves infected with HLB disease across seven HLB-tolerant and susceptible citrus species. This Citrus pangenome and pangenes family study offers a comprehensive resource and new insights into the structural and functional diversity, identifying candidate genes that are important for future research to understand the stress-responsive mechanisms in Citrus.

RevDate: 2025-03-26

Zheng Z, Lv J, Niu Z, et al (2025)

Genetic insights into developmental variations of spiny bracts among hazels through the pangenome construction.

Plant biotechnology journal [Epub ahead of print].

RevDate: 2025-03-27

Cheng G, An X, Dai Y, et al (2025)

Genomic Insights into Cobweb Disease Resistance in Agaricus bisporus: A Comparative Analysis of Resistant and Susceptible Strains.

Journal of fungi (Basel, Switzerland), 11(3):.

Agaricus bisporus, a globally cultivated edible fungus, faces significant challenges from fungal diseases like cobweb disease caused by Cladobotryum mycophilum, which severely impacts yield. This study aimed to explore the genetic basis of disease resistance in A. bisporus by comparing the genomes of a susceptible strain (AB7) and a resistant strain (AB58). Whole-genome sequencing of AB7 was performed using PacBio Sequel SMRT technology, and comparative genomic analyses were conducted alongside AB58 and other fungal hosts of C. mycophilum. Comparative genomic analyses revealed distinct resistance features in AB58, including enriched regulatory elements, specific deletions in AB7 affecting carbohydrate-active enzymes (CAZymes), and unique cytochrome P450 (CYP) profiles. Notably, AB58 harbored more cytochrome P450 genes related to fatty acid metabolism and unique NI-siderophore synthetase genes, contributing to its enhanced environmental adaptability and disease resistance. Pan-genome analysis highlighted significant genetic diversity, with strain-specific genes enriched in pathways like aflatoxin biosynthesis and ether lipid metabolism, suggesting distinct evolutionary adaptations. These findings provide valuable insights into the genetic basis underlying disease resistance in A. bisporus, offering a foundation for future breeding strategies to improve fungal crop resilience.

RevDate: 2025-03-27

Bello A, Ning S, Zhang Q, et al (2025)

Genomic analysis of multidrug-resistant Escherichia coli isolated from dairy cows in Shihezi city, Xinjiang, China.

Frontiers in microbiology, 16:1527546.

INTRODUCTION: Dairy farming plays a vital role in agriculture and nutrition; however, the emergence of antimicrobial resistance (AMR) among bacterial pathogens poses significant risks to public health and animal welfare. Multidrug-resistant (MDR) Escherichia coli strains are of particular concern due to their potential for zoonotic transmission and resistance to multiple antibiotics. In this study, we investigated the prevalence of AMR and analyzed the genomes of two MDR E. coli isolated from dairy cows in Shihezi City.

METHODS: Fecal samples were collected from dairy cows, and E. coli strains were isolated. Antibiotic susceptibility testing was conducted using the Kirby-Bauer disk diffusion method against 14 antibiotics. Two MDR isolates (E.coli_30 and E.coli_45) were selected for whole-genome sequencing and comparative genomic analysis. The Comprehensive Antibiotic Resistance Database (CARD) was used to identify AMR genes, and virulence factors were analyzed. Phylogenetic analysis was performed to determine the evolutionary relationships of the isolates, and a pangenome analysis of 50 E. coli strains was conducted to assess genetic diversity. The presence of mobile genetic elements (MGEs), including insertion sequences (IS) and transposons, was also examined.

RESULTS: Among the E. coli isolates, 22.9% exhibited MDR, with high resistance to imipenem and ciprofloxacin, while gentamicin and tetracycline remained the most effective antibiotics. Genomic analysis revealed key AMR genes, including mphA, qnrS1, and bla CTX-M-55 (the latter found only in E.coli_45), conferring resistance to macrolides, quinolones, and beta-lactams, respectively. Virulence genes encoding type III secretion systems (TTSS) and adhesion factors were identified, indicating pathogenic potential. Phylogenetic analysis showed that E.coli_30 and E.coli_45 originated from distinct ancestral lineages. The presence of two extended-spectrum β-lactamase (ESBL) genes in E.coli_45 was noticeable, so we studied their global and national distribution using evolutionary analysis. We found that they are endemic in E. coli, Salmonella enterica, and Klebsiella pneumoniae. Pangenome analysis revealed significant genetic diversity among E. coli strains, with unique genes related to metabolism and stress response. This indicates the bacteria's adaptation to various environments. MGEs were identified as key contributors to genetic variability and adaptation.

DISCUSSION: This study highlights the growing threat of MDR E. coli in dairy farms, emphasizing the critical role of MGEs in the spread of resistance genes. The genetic diversity observed suggests strong adaptive capabilities, justifying the need for continuous AMR surveillance in livestock. Effective monitoring and mitigation strategies are essential to prevent the dissemination of MDR bacteria, thereby protecting both animal and public health.

RevDate: 2025-03-26
CmpDate: 2025-03-26

Innamorati KA, Earl JP, Barrera SC, et al (2025)

Metronidazole response profiles of Gardnerella species are congruent with phylogenetic and comparative genomic analyses.

Genome medicine, 17(1):28.

BACKGROUND: Bacterial vaginosis (BV) affects 20-50% of reproductive-age female patients annually, arising when opportunistic pathogens outcompete healthy vaginal flora. Many patients fail to resolve symptoms with a course of metronidazole, the current first-line treatment for BV. Our study was designed to identify genomic variation associated with metronidazole resistance among strains of Gardnerella vaginalis spp. (GV), a genus of biogenic-amine-producing bacteria closely associated with BV pathogenesis, for the development of a companion molecular diagnostic.

METHODS: Whole-genome sequencing and comparative genomic metrics, including average nucleotide identity and GC content, were performed on a diverse set of 129 GV genomes to generate data for detailed taxonomic analyses. Pangenomic analyses were employed to construct a phylogenetic tree and cluster highly related strains within genospecies. G. vaginalis spp. clinical isolates within our collection were subjected to plate-based minimum inhibitory concentration (MIC) testing of metronidazole (n = 60) and clindamycin (n = 63). DECIPHER and MAFFT were used to identify genospecies-specific primers associated with antibiotic-resistance phenotypes. PCR-based analyses with these primers were used to confirm their specificity for the relevant genospecies.

RESULTS: Eleven distinct genospecies based on standard ANI criteria were identified among the GV strains in our collection. Metronidazole MIC testing revealed six genospecies within a closely related phylogenetic clade contained only highly metronidazole-resistant strains (MIC ≥ 32 µg/mL) and suggested at least two mechanisms of metronidazole resistance within the eleven GV genospecies. All strains within the six highly metronidazole-resistant genospecies displayed susceptibility to clinically relevant clindamycin concentrations (MIC ≤ 2 µg/mL). A PCR-based molecular diagnostic assay was developed to distinguish between members of the metronidazole-resistant and mixed-response genospecies, which should be useful for determining the clade membership of various GV strains and could assist in the selection of appropriate antibiotic therapies for BV cases.

CONCLUSIONS: This study provides comparative genomic and phylogenetic evidence for eleven distinct genospecies within the genus Gardnerella vaginalis spp., and identifies genospecies-specific responses to metronidazole, the first-line treatment for BV. A companion molecular diagnostic assay was developed that is capable of identifying essentially all highly metronidazole-resistant strains that phylogenetically cluster together within the GV genospecies, which is informative for antibiotic treatment options.

RevDate: 2025-03-26
CmpDate: 2025-03-26

Wan L, Deng C, Liu B, et al (2025)

Telomere-to-telomere genome assemblies of three silkworm strains with long-term pupal characteristics.

Scientific data, 12(1):501.

The domesticated silkworm (Bombyx mori) is both economically significant and a valuable model organism. However, challenges persist in silk production, particularly in preserving silkworm cocoons. The wild silkworm (Bombyx mandarina), a close relative, with long-term pupal characteristics, could address storage and industrial silk production issues. We conducted interspecies hybridization between domestic and wild silkworms, successfully introducing the long-pupal period trait into the domestic silkworm through genomic integration. Here, we presented the telomere-to-telomere genome assemblies of three silkworm strains (KA, L, and M) with long-term pupal characteristics. The genome assembly sizes ranged from 453.82 Mb to 461.92 Mb, with high contig N50 values and completeness. We predicted over 14,000 protein-coding genes and identified strain-specific fragments. This research enriches the domestic silkworm pan-genome project and provides a foundation for further genetic studies. By introducing the trait, we have for the first time reported a phenomenon of genomic introgression between domestic and wild silkworm, and have also opened up a new avenue for silkworm breeding.

RevDate: 2025-03-25

Silva UCM, da Silva DRC, Cuadros-Orellana S, et al (2025)

Genomic and phenotypic insights into Serratia interaction with plants from an ecological perspective.

Brazilian journal of microbiology : [publication of the Brazilian Society for Microbiology] [Epub ahead of print].

We investigated the plant growth-promoting potential of two endophytic strains of Serratia marcescens, namely SmCNPMS2112 and SmUFMG85, which were isolated from the roots of the same maize (Zea mays) plant. The strains were evaluated in vitro for their ability to produce siderophores and indoleacetic acid, form biofilm, solubilize iron phosphate (Fe-P) and Araxá rock phosphate (RP), mineralize phytate, and for their ability to adhere and colonize host roots. Additionally, their plant growth-promoting potential was tested in vivo under greenhouse conditions using millet grown in soil under two fertilization schemes (triple superphosphate, TSP, or commercial rock phosphate, cRP). Both strains improved at least five physiological traits of millet or P content in soil. In order to elucidate the genetic basis of the plant growth-promoting ability of these strains, their genomes were compared. While both genomes exhibited a similar overall functional profile, each strain had unique features. SmCNPMS2112 contained genes related to arsenic and aromatic hydrocarbons degradation, whereas SmUFMG85 harbored genes related to rhamnolipid biosynthesis and chromium bioremediation. Also, we observe a unique repertoire of genes related to plant growth-promotion (PGP) in the SmUFMG85 genome, including oxalate decarboxylase (OxdC), associated with the catabolism of oxalic acid, and aerobactin siderophore (lucD) in the genome of SmCNPMS2112. The alkaline phosphatase was observed on two strains, but acid phosphatase was exclusive to SmUFMG85. Eighteen secondary metabolic gene clusters, such as those involved in the biosynthesis of macrolides and bacillomycin, among others, occur in both strains. Moreover, both genomes contained prophages, suggesting that viral-mediated horizontal gene transfer may be a key mechanism driving genomic variability in the endophytic environment. 
Indeed, most of the unique and accessory genes of SmUFMG85 and SmCNPMS2112 were localized in genomic islands, highlighting genome plasticity and its underlying drivers. To investigate the ecological distribution of plant-interaction traits in the genus Serratia, the genomes of SmUFMG85 and SmCNPMS2112 strains were compared with those of 19 other Serratia strains of different species, which were isolated from different environments. We observe that many features for PGP are present in all genomes, regardless of niche, for instance: formation of flagella, fimbriae and pili, chemotaxis, biosynthesis of siderophores, indole-3-acetic acid (IAA) and volatile organic (VOC) and inorganic (VIC) compounds, such as acetoin and HCN. Also, all the analyzed genomes show an antimicrobial resistance repertoire of genes that confer resistance to several antibiotics belonging to the groups of aminoglycosides and quinolones, for instance. Also, from a niche partitioning perspective, secretion system preference and the ability to produce exopolysaccharides involved in biofilm formation are among the features that vary the most among strains, and most likely influence niche adaptation in Serratia spp., even though only the latter seems to be a feature specifically associated with virulence in the analyzed strains. Our results show that populations of bacteria sharing the same niche can present significant physiological and genomic differences, and reveal the intraspecific metabolic plasticity that underlies plant-bacteria interactions. Also, this study reveals the potential of two Serratia marcescens strains as bioinoculants in agriculture. Considering that Serratia spp. are regarded as low risk biological agents, despite the fact that they can be associated with human disease, we suggest that strain biosafety be evaluated using a combination of genome and phenotypic analyses, as presented herein.

RevDate: 2025-03-26
CmpDate: 2025-03-25

Bixler BJ, Royer CJ, Petit III RA, et al (2025)

Comparative genomic analysis of emerging non-typeable Haemophilus influenzae (NTHi) causing emerging septic arthritis in Atlanta.

PeerJ, 13:e19081.

BACKGROUND: Haemophilus influenzae is a Gram-negative bacterium that can exist as a commensal organism or cause a range of diseases, from ear infections to invasive conditions like meningitis. While encapsulated H. influenzae strains have historically been linked to severe diseases, non-typeable Haemophilus influenzae (NTHi) strains, lacking an intact capsule locus, have emerged as the leading cause of invasive H. influenzae infections, particularly following the widespread use of the H. influenzae serotype b (Hib) vaccine.

METHODS: In response to a significant increase in invasive NTHi infections among persons living with HIV in metropolitan Atlanta during 2017-2018, we conducted a comparative genomic analysis of two predominant NTHi clones, C1 and C2, identified during this period. These clones correspond to multilocus sequence types ST164 and ST1714, respectively. We analyzed the genomic characteristics of C1 and C2 using whole genome sequencing data and compared them to a broader pangenome of H. influenzae strains to identify potential virulence factors and genetic adaptations.

RESULTS: Both C1 and C2 isolates were highly related within their clusters, with C1 showing a maximum of 132 SNPs and C2 showing 149 SNPs within their respective core genomes. Genomic analysis revealed significant deletions in known virulence genes, surprisingly suggesting possible attenuation of virulence. No unique accessory genes were identified that distinguished C1 and C2 from other H. influenzae strains, although both clusters exhibited a consistent loss of the pxpB gene (encoding 5-oxoprolinase subunit), replaced by a mobile cassette containing genes potentially involved in sugar metabolism. All C1 and C2 isolates showed potential enrichment in accessory genes associated with systemic infections.

CONCLUSIONS: Our study suggests that while C1 and C2 clones possess some genetic markers potentially linked to systemic infections, there are no definitive unique genetic factors that distinguish these clones as more virulent than other H. influenzae strains. The expansion of these clones in a vulnerable population may reflect both chance introduction and potential adaptations to the host environment. Further research is needed to understand the implications of these genetic findings on the clinical management and prevention of invasive NTHi infections.

RevDate: 2025-03-25

Stoltze U, Junk SV, Byrjalsen A, et al (2025)

Overt and covert genetic causes of pediatric acute lymphoblastic leukemia.

Leukemia [Epub ahead of print].

Pediatric acute lymphoblastic leukemia (pALL) is the most common childhood malignancy, yet its etiology remains incompletely understood. However, over the course of three waves of germline genetic research, several non-environmental causes have been identified. Beginning with trisomy 21, seven overt cancer predisposition syndromes (CPSs)-characterized by broad clinical phenotypes that include an elevated risk of pALL-were first described. More recently, newly described CPSs conferring high risk of pALL are increasingly covert, with six exhibiting only minimal or no non-cancer features. These 13 CPSs now represent the principal known hereditary causes of pALL, and human pangenomic data indicates a strong negative selection against mutations in the genes associated with these conditions. Collectively they affect approximately 1 in 450 newborns, of which just a minority will develop the disease. As evidenced by tailored leukemia care protocols for children with trisomy 21, there is growing recognition that CPSs warrant specialized diagnostic, therapeutic, and long-term management strategies. In this review, we investigate the evidence that the 12 other CPSs associated with high risk of pALL may also see benefits from specialized care - even if these needs are often incompletely mapped or addressed in the clinic. Given the rarity of each syndrome, collaborative international research and shared data initiatives will be crucial for advancing knowledge and improving outcomes for these patients.

RevDate: 2025-03-25

Li L, Wu Z, Guarracino A, et al (2025)

Genetic modulation of protein expression in rat brain.

iScience, 28(3):112079.

Genetic variations in protein expression are implicated in a broad spectrum of common diseases and complex traits but remain less explored compared to mRNA and classical phenotypes. This study systematically analyzed brain proteomes in a rat family using tandem mass tag (TMT)-based quantitative mass spectrometry. We quantified 8,119 proteins across two parental strains (SHR/Olalpcv and BN-Lx/Cub) and 29 HXB/BXH recombinant inbred (RI) strains, identifying 597 proteins with differential expression and 464 proteins linked to cis-acting quantitative trait loci (pQTLs). Proteogenomics identified 95 variant peptides, and sex-specific analyses revealed both shared and distinct cis-pQTLs. We improved the ability to pinpoint candidate genes underlying pQTLs by utilizing the rat pangenome and explored the connections between pQTLs in rats and human disorders. Collectively, this study highlights the value of large proteo-genetic datasets in elucidating protein modulation in the brain and its links to complex central nervous system (CNS) traits.

RevDate: 2025-03-22

Bigey F, Menatong Tene X, Wessner M, et al (2025)

Insights into the genomic and phenotypic diversity of Monosporozyma unispora strains isolated from anthropic environments.

FEMS yeast research pii:8090502 [Epub ahead of print].

Food microorganisms have been employed for centuries for the processing of fermented foods, leading to adapted populations with phenotypic traits of interest. The yeast Monosporozyma unispora (formerly Kazachstania unispora) has been identified in a wide range of fermented foods and beverages. Here, we studied the genetic and phenotypic diversity of a collection of 53 strains primarily derived from cheese, kefir, and sourdough. The 12.7 Mb genome of the type strain CLIB 234T was sequenced and assembled into near-complete chromosomes and annotated at the structural and functional levels, with 5639 coding sequences predicted. Comparison of the pangenome and core genome revealed minimal differences. From the complete yeast collection, we gathered genetic data (diversity, phylogeny, population structure) and phenotypic data (growth capacity on solid media). Population genomic analyses revealed low level of nucleotide diversity and strong population structure, with the presence of two major clades corresponding to ecological origins (cheese and kefir vs. plant derivatives). A high prevalence of extensive loss of heterozygosity and a slow linkage disequilibrium decay suggested a predominantly clonal mode of reproduction. Phenotypic analyses revealed growth variation under stress conditions, including high salinity and low pH, but no definitive link between phenotypic traits and environmental adaptation was established.

RevDate: 2025-03-22

Chen J, Yu Q, Zhang T, et al (2025)

Quorum sensing luxI/R genes enhances cadmium detoxification in Aeromonas by up-regulating EPS production and cadmium resistance genes.

Journal of hazardous materials, 491:137959 pii:S0304-3894(25)00875-1 [Epub ahead of print].

The increasing cadmium (Cd) contamination in the environment poses a serious threat to ecosystem health and human safety. This study investigated the roles of quorum sensing (QS) genes luxI/R, key components of the QS system, in the Cd accumulation and detoxification in Aeromonas. Pan-genome analysis showed that luxI/R and Cd resistance genes were highly conserved in Aeromonas species. Strains of luxI/R knockout, complementation and overexpression were constructed via homologous recombination. The luxI/R deletion significantly reduced Cd removal by up to 32 %, decreased extracellular protein (18-36 %) and polysaccharide (19-33 %) contents, whereas luxI/R overexpression enhanced Cd removal capacity by 11 %. Transcriptomic and metabolomic analyses further revealed coordinated changes. In the ΔluxI/R strain, genes involved in assimilatory sulfate reduction and arginine biosynthesis were downregulated, accompanied by reduced levels of glycerophospholipid, vitamin, and cytochrome P450-related metabolites. In contrast, luxI/R overexpression upregulated arginine synthesis (2.0-3.5 fold) and sulfate assimilation (1.4-2.4 fold) genes, with corresponding increases of metabolites. Together these findings demonstrate that luxI/R genes may play a crucial role in regulation of EPS production and Cd resistance gene expression, thus enhancing our understanding of microbial Cd detoxification mechanisms.

RevDate: 2025-03-21

Cuecas A, Delgado JA, JM González (2025)

Inferring inter-phylum gene transfer events from unique genes detected in Parageobacillus thermoglucosidasius.

Molecular phylogenetics and evolution pii:S1055-7903(25)00046-6 [Epub ahead of print].

A pan-genome includes the complete pool of genes of a species, including those recently acquired. New additions of genetic material to a genome are frequently linked to horizontal gene transfer (HGT) processes and can confer adaptive advantages, improving the recipient's functional response and growth. Previous studies have reported that Parageobacillus frequently exchange DNA, mainly with other members of the phylum Bacillota sharing similar environments. Nevertheless, little is known about the occurrence of transfer events between phylogenetically distant microorganisms. In this work, based on the pan-genome of Parageobacillus thermoglucosidasius, we detected a number of unique genes within the species, which were used to carry out BLAST searches to find similar genes in distant bacterial taxa. We aimed to infer potential inter-phylum HGT events. Results suggested genetic exchanges among different phyla. Among them, Actinomycetota, Pseudomonadota and the Bacteroidota/Chlorobiota group were the dominant observed phyla. Those HGT events frequently involved ATP binding cassette transporters, enzymes of the C metabolism and transcriptional regulators. Based on the frequency of these genes within specific phyla, directional HGT events could be proposed. A dominant origin of the suggested HGT events could be within the Bacillota. This exploratory analysis indicates that Bacillota are frequent exporters of DNA both within the phylum and to phylogenetically distant groups. Studying long-distance HGT can help to better understand microbial evolution, the relevance of HGT processes within the prokaryotes and the genomic plasticity of microorganisms.

RevDate: 2025-03-20

Thudi M, Mascher M, M Jayakodi (2025)

Pangenome charts the genomic path for wheat improvement.

Trends in plant science pii:S1360-1385(25)00062-7 [Epub ahead of print].

A wheat pangenome of 17 Chinese cultivars, recently developed by Jiao et al., reveals structural variants (SVs) shaped by cultural, dietary, and environmental changes. This resource provides access to East Asian wheat genetic diversity and supports genome-driven efforts to advance wheat improvement and adaptation to changing agricultural demands.

RevDate: 2025-03-20
CmpDate: 2025-03-20

Wu XM, Li ZP, Huang J, et al (2025)

[Pangenome analysis on plasmids carried by hypervirulent Klebsiella pneumoniae].

Zhonghua liu xing bing xue za zhi = Zhonghua liuxingbingxue zazhi, 46(3):506-513.

Objective: To analyze the pangenome, pan drug resistance genes, pan virulence genes, pan replicons, and others of the plasmids carried by hypervirulent Klebsiella pneumoniae (hvKP) in the world and their evolutionary trends over time, and provide evidence for more comprehensive understanding of the evolution of genetic diversity, drug resistance genes, and virulence genes of the plasmids. Methods: From the National Center for Biotechnology Information database, a total of 1 738 plasmids were screened from 524 strains with completed genome sequences in 2 136 strains of hvKP carrying plasmids. Through pangenome, pan drug resistance gene, and pan-virulence gene composition and functional analyses, the curves of pangenome size and new gene size against plasmid isolation time were established, revealing the diversity of the plasmid pangenome and its evolutionary patterns. Results: The homologous genes, homologous drug resistance genes, homologous virulence genes, and replicons of the plasmids carried by hvKP comprised 12 906, 149, 107 and 89 types, respectively. The fitting curves for the number of new genes, new drug resistance genes and new replicons increased with the increase of plasmids in an open state, while the curve for novel virulence genes was in a closed state. An obvious increase in new drug resistance genes was observed during 2018-2019. Among the newly added drug resistance genes during 2021-2023, besides those conferring aminoglycoside resistance, they were mainly new subtypes conferring carbapenem resistance. Conclusions: The pangenome of plasmids carried by hvKP exhibited high diversity, with the plasmid pan genes, pan drug resistance genes, and pan replicon types gradually expanding, while the pan virulence genes remain stable. The increase in novel drug resistance genes in specific years and the emergence of new carbapenem-resistant gene subtypes during 2021-2023 suggested the need for strengthened drug resistance surveillance and prevention efforts, with particular attention to carbapenem resistance.
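
As a rough illustration of the "pangenome size" and "new gene" curves mentioned in the abstract above, the sketch below counts how many previously unseen genes each added plasmid contributes. It is only a minimal counting example under stated assumptions: the abstract does not describe the fitting procedure, the function name `accumulation_curve` is hypothetical, and the gene sets are toy data, not results from the study.

```python
def accumulation_curve(gene_sets):
    """Return (pangenome sizes, new-gene counts) as genomes are added in order.

    gene_sets: iterable of sets of gene-family names, e.g. ordered by
    isolation time (the ordering is an assumption of this sketch).
    """
    seen = set()
    pan_sizes = []
    new_counts = []
    for genes in gene_sets:
        new = set(genes) - seen      # gene families not observed so far
        seen |= new                  # grow the pangenome
        new_counts.append(len(new))
        pan_sizes.append(len(seen))
    return pan_sizes, new_counts


# Toy plasmid gene content (illustrative names only).
toy_plasmids = [
    {"repA", "blaKPC"},
    {"repA", "iucA"},
    {"repA", "blaKPC", "blaNDM"},
]
pan_sizes, new_counts = accumulation_curve(toy_plasmids)
print(pan_sizes)   # cumulative pangenome size after each plasmid
print(new_counts)  # new gene families contributed by each plasmid
```

If the new-gene counts keep staying above zero as more genomes are added, the pangenome is usually described as open; if they fall towards zero, it is described as closed.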

RevDate: 2025-03-20

Groza C, Ge B, Cheung WA, et al (2025)

Expanded methylome and quantitative trait loci detection by long-read profiling of personal DNA.

Genome research pii:gr.279240.124 [Epub ahead of print].

Structural variants (SVs) are omnipresent in human DNA, yet their genotype and methylation statuses are rarely characterized due to previous limitations in genome assembly and detection of modified nucleotides. Also, the extent to which SVs act as methylation quantitative trait loci (SV-mQTLs) is largely unknown. Here, we generated a pangenome graph summarizing SVs in 782 de novo assemblies obtained from Genomic Answers for Kids, capturing 14.6 million CpG dinucleotides that are absent from the CHM13v2 reference (SV-CpGs), thus expanding their number by 43.6%. Using 435 methylomes, we genotyped 4.06 million SV-CpGs, of which 3.93 million (96.8%) are methylated at least once. Nonrepeat sequences contribute 1.59 × 10[6] novel SV-CpGs, followed by centromeric satellites (6.57 × 10[5]), simple repeats (5.40 × 10[5]), Alu elements (5.07 × 10[5]), satellites (2.17 × 10[5]), LINE-1s (1.83 × 10[5]), and SVA (SINE-VNTR-Alu) elements (1.50 × 10[5]). Centromeric satellites, simple repeats, and SVAs are overrepresented in SV-CpGs versus reference CpGs. Similarly, methylation levels in SV-CpGs are more variable than in reference CpGs. To explore if SVs are potentially causal for functional variation, we measured SV-mQTLs. This revealed over 230,464 methylation bins where the methylation is associated with common SVs within 100 kbp. Finally, we identified 65,659 methylation bins (28.5%) where the leading QTL variant is an SV. In conclusion, we demonstrate that graph pangenomes provide full SV structures, the associated methylation variation, and reveal tens of thousands of SV-mQTLs, underscoring the importance of assembly based analyses of human traits.

RevDate: 2025-03-20

Li Q, Keskus AG, Wagner J, et al (2025)

Unraveling the hidden complexity of cancer through long-read sequencing.

Genome research pii:gr.280041.124 [Epub ahead of print].

Cancer is fundamentally a disease of the genome, characterized by extensive genomic, transcriptomic, and epigenomic alterations. Most current studies predominantly use short-read sequencing, gene panels, or microarrays to explore these alterations; however, these technologies can systematically miss or misrepresent certain types of alterations, especially structural variants, complex rearrangements, and alterations within repetitive regions. Long-read sequencing is rapidly emerging as a transformative technology for cancer research by providing a comprehensive view across the genome, transcriptome, and epigenome, including the ability to detect alterations that previous technologies have overlooked. In this review, we explore the current applications of long-read sequencing for both germline and somatic cancer analysis. We provide an overview of the computational methodologies tailored to long-read data and highlight key discoveries and resources within cancer genomics that were previously inaccessible with prior technologies. We also address future opportunities and persistent challenges, including the experimental and computational requirements needed to scale to larger sample sizes, the hurdles in sequencing and analyzing complex cancer genomes, and opportunities for leveraging machine learning and artificial intelligence technologies for cancer informatics. We further discuss how the telomere-to-telomere genome and the emerging human pangenome could enhance the resolution of cancer genome analysis, potentially revolutionizing early detection and disease monitoring in patients. Finally, we outline strategies for transitioning long-read sequencing from research applications to routine clinical practice.

RevDate: 2025-03-20
CmpDate: 2025-03-20

Roberts MD, Davis O, Josephs EB, et al (2025)

K-mer-based Approaches to Bridging Pangenomics and Population Genetics.

Molecular biology and evolution, 42(3):.

Many commonly studied species now have more than one chromosome-scale genome assembly, revealing a large amount of genetic diversity previously missed by approaches that map short reads to a single reference. However, many species still lack multiple reference genomes, and correctly aligning references to build pangenomes can be challenging for many species, limiting our ability to study this missing genomic variation in population genetics. Here, we argue that k-mers are a very useful but underutilized tool for bridging the reference-focused paradigms of population genetics with the reference-free paradigms of pangenomics. We review current literature on the uses of k-mers for performing three core components of most population genetics analyses: identifying, measuring, and explaining patterns of genetic variation. We also demonstrate how different k-mer-based measures of genetic variation behave in population genetic simulations according to the choice of k, depth of sequencing coverage, and degree of data compression. Overall, we find that k-mer-based measures of genetic diversity scale consistently with pairwise nucleotide diversity (π) up to values of about π=0.025 (R[2]=0.97) for neutrally evolving populations. For populations with even more variation, using shorter k-mers will maintain the scalability up to at least π=0.1. Furthermore, in our simulated populations, k-mer dissimilarity values can be reliably approximated from counting bloom filters, highlighting a potential avenue to decreasing the memory burden of k-mer-based genomic dissimilarity analyses. For future studies, there is a great opportunity to further develop methods for identifying selected loci using k-mers.
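
To make the idea of a k-mer-based dissimilarity concrete, here is a minimal sketch that compares two sequences by the Jaccard dissimilarity of their k-mer sets. The abstract above does not specify which measures the authors evaluated, so the choice of Jaccard, the function names, and the toy sequences are all assumptions for illustration. Reverse complements and sequencing errors are ignored.

```python
def kmer_set(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_dissimilarity(seq_a, seq_b, k):
    """1 - |A ∩ B| / |A ∪ B| over the k-mer sets of two sequences."""
    a, b = kmer_set(seq_a, k), kmer_set(seq_b, k)
    union = a | b
    if not union:                 # both sequences shorter than k
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Toy sequences differing by a single substitution (illustrative only).
d = jaccard_dissimilarity("ACGTACGT", "ACGTTCGT", k=3)
print(round(d, 3))
```

A single substitution changes up to k overlapping k-mers, which is one reason the choice of k affects how such measures scale with nucleotide diversity.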

RevDate: 2025-03-20

Yu X, Qu M, Wu P, et al (2025)

Correction: Super pan-genome reveals extensive genomic variations associated with phenotypic divergence in Actinidia.

Molecular horticulture, 5(1):31.

RevDate: 2025-03-19

Choudoir MJ, Narayanan A, Rodriguez-Ramos D, et al (2025)

Pangenomes suggest ecological-evolutionary responses to experimental soil warming.

mSphere [Epub ahead of print].

Below-ground carbon transformations that contribute to healthy soils represent a natural climate change mitigation, but newly acquired traits adaptive to climate stress may alter microbial feedback mechanisms. To better define microbial evolutionary responses to long-term climate warming, we study microorganisms from an ongoing in situ soil warming experiment where, for over three decades, temperate forest soils have been continuously heated at 5°C above ambient. We hypothesize that across generations of chronic warming, genomic signatures within diverse bacterial lineages reflect adaptations related to growth and carbon utilization. From our bacterial culture collection isolated from experimental heated and control plots, we sequenced genomes representing dominant taxa sensitive to warming, including lineages of Actinobacteria, Alphaproteobacteria, and Betaproteobacteria. We investigated genomic attributes and functional gene content to identify signatures of adaptation. Comparative pangenomics revealed accessory gene clusters related to central metabolism, competition, and carbon substrate degradation, with few functional annotations explicitly associated with long-term warming. Trends in functional gene patterns suggest genomes from heated plots were relatively enriched in central carbohydrate and nitrogen metabolism pathways, while genomes from control plots were relatively enriched in amino acid and fatty acid metabolism pathways. We observed that genomes from heated plots had less codon bias, suggesting potential adaptive traits related to growth or growth efficiency. Codon usage bias varied for organisms with similar 16S rrn operon copy number, suggesting that these organisms experience different selective pressures on growth efficiency. Our work suggests the emergence of lineage-specific trends as well as common ecological-evolutionary microbial responses to climate change.

IMPORTANCE: Anthropogenic climate change threatens soil ecosystem health in part by altering below-ground carbon cycling carried out by microbes. Microbial evolutionary responses are often overshadowed by community-level ecological responses, but adaptive responses represent potential changes in traits and functional potential that may alter ecosystem function. We predict that microbes are adapting to climate change stressors like soil warming. To test this, we analyzed the genomes of bacteria from a soil warming experiment where soil plots have been experimentally heated 5°C above ambient for over 30 years. While genomic attributes were unchanged by long-term warming, we observed trends in functional gene content related to carbon and nitrogen usage and genomic indicators of growth efficiency. These responses may represent new parameters in how soil ecosystems feed back to the climate system.

RevDate: 2025-03-20

Gayathri M, Sharanya R, Renukadevi P, et al (2025)

Genomic configuration of Bacillus subtilis (NMB01) unveils its antiviral activity against Orthotospovirus arachinecrosis infecting tomato.

Frontiers in plant science, 16:1517157.

Orthotospovirus arachinecrosis (groundnut bud necrosis virus, GBNV) infecting tomato is a devastating viral pathogen responsible for severe yield losses of up to 100%. Considering the significance of plant growth-promoting bacteria in inducing innate immunity, attempts were made to evaluate the antiviral efficacy of Bacillus subtilis NMB01 against GBNV in cowpea and tomato. Foliar application of B. subtilis NMB01 at 1.5% onto the leaves of cowpea and tomato followed by challenge inoculation with GBNV significantly reduced the incidence of GBNV by 80% to 90% compared with the untreated inoculated control. We therefore sought to understand whether any genes were contributing toward the suppression of GBNV in assay hosts. To this end, whole-genome sequencing of B. subtilis NMB01 was carried out. The genome sequence of NMB01 revealed the presence of secondary metabolite biosynthetic gene clusters, including non-ribosomal peptide synthetases (NRPSs) and polyketide synthases (PKSs) which also encoded bacteriocins and antimicrobial peptides. The pan-genome analysis identified 1,640 core genes, 4,885 dispensable genes, and 60 unique genes, including MAMP genes that induce host immune responses. Comparative genome and proteome analysis with other genomes of B. subtilis strains in the public domain through OrthoVenn analysis revealed the presence of 4,241 proteins, 3,695 clusters, and 655 singletons in our study isolate. Furthermore, the NMB01-treated tomato plants showed increased expression of defense-related genes (MAPKK1, WRKY33, PR1, PAL, and NPR1), enhancing immune system priming against GBNV infection. These findings suggest that B. subtilis NMB01 can be used as a promising biological control agent for managing plant viral disease sustainably.

RevDate: 2025-03-19

Vedrine E, Bessenay L, Philipponnet C, et al (2025)

Granulomatous nephropathy: have you thought about genetics?

Pediatric nephrology (Berlin, Germany) [Epub ahead of print].

We report here the case of a 16-year-old girl with chronic kidney disease, where biopsy revealed tubulointerstitial nephropathy with granulomas. Initial treatments included immunosuppressive therapy until genetic testing with exome sequencing identified nephronophthisis due to a homozygous deletion of the NPHP1 gene, marking a unique instance of granulomatous nephropathy related to nephronophthisis. With severe kidney damage, her function has not recovered, necessitating peritoneal dialysis and transplantation. This case highlights the need to consider nephronophthisis in inflammatory interstitial and granulomatous nephropathy, especially when it appears severe and early in life. In addition, it underscores the importance of genetic testing for accurate diagnosis and management in pediatric nephropathies.

RevDate: 2025-03-18

Chen YW, Su YC, Chen WY, et al (2025)

Comprehensive Genomic Analysis of Antimicrobial Resistance in Aeromonas dhakensis.

Microbial drug resistance (Larchmont, N.Y.) [Epub ahead of print].

Aeromonas dhakensis is prevalent in aquatic environments in Taiwan and known for its notable antimicrobial resistance. However, comprehensive pan-genomic studies for this species in Taiwan are limited. This study analyzed 28 clinical A. dhakensis isolates using single-molecule real-time sequencing technology, coupled with diverse databases, to elucidate the whole genomes. The focus was on phylogenetic relatedness, antimicrobial resistance genes, and mobile genetic elements. Genomic analysis and multilocus sequence typing were utilized to identify A. dhakensis strains of heterogeneous origins. The detection of various β-lactamase genes (blacphA, blaimiH, blaAQU, blaOXA, blaTEM-1, blaTRU-1, and blaVEB) in clinical A. dhakensis isolates raises concern, especially considering the use of carbapenems and third-generation cephalosporins in patients with severe infections. Notably, most A. dhakensis strains carry chromosome-encoded β-lactamases, including AmpC, metallo-β-lactamase, and oxacillinase, and were susceptible to cefepime in drug susceptibility tests. A. dhakensis strains were also susceptible to aminoglycosides, fluoroquinolones, tigecycline, and trimethoprim/sulfamethoxazole. Three of the 28 A. dhakensis isolates carried plasmids containing an array of drug resistance genes, suggesting this species is likely a recipient or donor of drug resistance genes through horizontal gene transfer. Our findings provide valuable insights into the antimicrobial resistance of A. dhakensis, highlighting the medical implications of its β-lactamase diversity and its potential role in the horizontal gene transfer of drug resistance genes.

RevDate: 2025-03-19
CmpDate: 2025-03-18

Liu R, Hu C, Gao D, et al (2025)

A special short-wing petal faba genome and genetic dissection of floral and yield-related traits accelerate breeding and improvement of faba bean.

Genome biology, 26(1):62.

BACKGROUND: A comprehensive study of the genome and genetics of superior germplasms is fundamental for crop improvement. Faba bean is a widely adapted protein crop with high yield potential, but its breeding and the development of its seed industry have been greatly hindered by its giant genome size and high outcrossing rate.

RESULTS: To fully explore the genomic diversity and genetic basis of important agronomic traits, we first generate a de novo genome assembly and perform annotation of a special short-wing petal faba bean germplasm (VF8137) exhibiting a low outcrossing rate. Comparative genome and pan-genome analyses reveal the genome evolution characteristics and unique pan-genes among the three different faba bean genomes. In addition, the genome diversity of 558 accessions of faba bean germplasm reveals three distinct genetic groups and remarkable genetic differences between the southern and northern germplasms. Genome-wide association analysis identifies several candidate genes associated with adaptation- and yield-related traits. We also identify one candidate gene related to short-wing petals by combining quantitative trait locus mapping and bulked segregant analysis. We further elucidate its function through multiple lines of evidence from functional annotation, sequence variation, expression differences, and protein structure variation.

CONCLUSIONS: Our study provides new insights into the genome evolution of Leguminosae and the genomic diversity of faba bean. It offers valuable genomic and genetic resources for breeding and improvement of faba bean.

RevDate: 2025-03-17

Olbrich J, Büchler T, E Ohlebusch (2025)

Generating Multiple Alignments on a Pangenomic Scale.

Bioinformatics (Oxford, England) pii:8082102 [Epub ahead of print].

MOTIVATION: Since novel long-read sequencing technologies allow for de novo assembly of many individuals of a species, high-quality assemblies are becoming widely available. For example, the recently published draft human pangenome reference was based on assemblies composed of contigs. There is an urgent need for a software tool that is able to generate a multiple alignment of genomes of the same species because current multiple sequence alignment programs cannot deal with such a volume of data.

RESULTS: We show that the combination of a well-known anchor-based method with the technique of prefix-free parsing yields an approach that is able to generate multiple alignments on a pangenomic scale, provided that large-scale structural variants are rare. Furthermore, experiments with real-world data show that our software tool PANAMA (PANgenomic Anchor-based Multiple Alignment) significantly outperforms current state-of-the-art programs.

AVAILABILITY: Source code is available at: https://gitlab.com/qwerzuiop/panama, archived at swh:1:dir:e90c9f664995acca9063245cabdd97549cf39694.

RevDate: 2025-03-17

Pedrozo R, Osakina A, Huang Y, et al (2025)

Status on Genetic Resistance to Rice Blast Disease in the Post-Genomic Era.

Plants (Basel, Switzerland), 14(5):.

Rice blast, caused by Magnaporthe oryzae, is a major threat to global rice production, necessitating the development of resistant cultivars through genetic improvement. Breakthroughs in rice genomics, including the complete genome sequencing of japonica and indica subspecies and the availability of various sequence-based molecular markers, have greatly advanced the genetic analysis of blast resistance. To date, approximately 122 blast-resistance genes have been identified, with 39 of these genes cloned and molecularly characterized. The application of these findings in marker-assisted selection (MAS) has significantly improved rice breeding, allowing for the efficient integration of multiple resistance genes into elite cultivars, enhancing both the durability and spectrum of resistance. Pangenomic studies, along with AI-driven tools like AlphaFold2, RoseTTAFold, and AlphaFold3, have further accelerated the identification and functional characterization of resistance genes, expediting the breeding process. Future rice blast disease management will depend on leveraging these advanced genomic and computational technologies. Emphasis should be placed on enhancing computational tools for the large-scale screening of resistance genes and utilizing gene editing technologies such as CRISPR-Cas9 for functional validation and targeted resistance enhancement and deployment. These approaches will be crucial for advancing rice blast resistance, ensuring food security, and promoting agricultural sustainability.

RevDate: 2025-03-18

Pacce VD, Guimarães AM, Kremer FS, et al (2025)

Integrated Bioinformatics Analysis for Target Identification and Evaluation of Recombinant Protein as an Antigen for Intradermal Skin Test in Bovine Tuberculosis Diagnosis.

ACS omega, 10(9):9187-9196.

Bovine tuberculosis (bTB) is a respiratory disease caused by Mycobacterium bovis, posing a significant threat to animal health and the livestock industry. Current control strategies for bTB rely on diagnostic tests and slaughter policies. However, the limitations of existing diagnostic methods, which depend on PPD antigens, necessitate the exploration of alternative antigens to enhance the accuracy and reliability of bTB diagnosis. This study aimed to identify, produce, and evaluate novel antigens for use in the intradermal skin test for bTB diagnosis. A pangenome analysis of four Mycobacterium species identified 12 unique genes specific to M. bovis SP38. Further integrated bioinformatic analysis revealed 224 genomic islands associated with virulence and pathogenesis. Among these, a highly antigenic protein, termed HP28, was selected for in vivo testing. The recombinant HP28 protein (rHP28) was expressed in E. coli and assessed for its ability to induce intradermal skin reactions in guinea pigs. The rHP28 protein elicited a skin reaction of 6.6 mm at 72 h post-injection, whereas negative controls showed no reaction. This study presents a pipeline for the selection of antigens using integrated bioinformatic analysis to identify diagnostic targets that can effectively distinguish between sensitized and non-sensitized animals, offering a promising approach for improving bTB diagnostics.

RevDate: 2025-03-17

Zytnicki M (2025)

Assessing genome conservation on pangenome graphs with PanSel.

Bioinformatics advances, 5(1):vbaf018.

MOTIVATION: With more and more telomere-to-telomere genomes assembled, pangenomes make it possible to capture the genomic diversity of a species. Because they introduce fewer biases, pangenomes, represented as graphs, tend to supplant the usual linear representation of a reference genome, augmented with variations. However, this major change requires new tools adapted to this data structure. Among the numerous questions that can be addressed to a pangenome graph is the search for conserved or divergent genes.

RESULTS: In this article, we present a new tool, named PanSel, which computes a conservation score for each segment of the genome, and finds genomic regions that are significantly conserved, or divergent. PanSel can be used on prokaryotes and eukaryotes, with a sequence identity not less than 98%.

PanSel, written in C++11 with no dependency, is available at https://github.com/mzytnicki/pansel.

RevDate: 2025-03-16
CmpDate: 2025-03-16

Mukhopadhya I, Martin JC, Shaw S, et al (2025)

Novel insights into carbohydrate utilisation, antimicrobial resistance, and sporulation potential in Roseburia intestinalis isolates across diverse geographical locations.

Gut microbes, 17(1):2473516.

Roseburia intestinalis is one of the most abundant and important butyrate-producing human gut anaerobic bacteria that plays an important role in maintaining health and is a potential next-generation probiotic. We investigated the pangenome of 16 distinct strains, isolated over several decades, identifying local and time-specific adaptations. More than 50% of the genes in each individual strain were assigned to the core genome, and 77% of the cloud genes were unique to individual strains, revealing the high level of genome conservation. Co-carriage of the same enzymes involved in carbohydrate binding and degradation in all strains highlighted major pathways in carbohydrate utilization and revealed the importance of xylan, starch and mannose as key growth substrates. A single strain had adapted to use rhamnose as a sole growth substrate, the first time this has been reported. The ubiquitous presence of motility and sporulation gene clusters demonstrates the importance of these phenotypes for gut survival and acquisition of this bacterium. More than half the strains contained functional, potentially transferable, tetracycline resistance genes. This study advances our understanding of the importance of R. intestinalis within the gut ecosystem by elucidating conserved metabolic characteristics among different strains, isolated from different locations. This information will help to devise dietary strategies to increase the abundance of this species, providing health benefits.
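
The core and cloud categories referred to in the abstract above are typically assigned from a gene presence/absence table. The following is a minimal sketch of that bookkeeping, not the authors' pipeline: the function name `classify_genes`, the threshold values, and the strain and gene labels are all assumptions chosen for illustration, and real tools use their own clustering and cutoffs.

```python
from collections import Counter


def classify_genes(strain_genes, core_frac=0.99, cloud_frac=0.15):
    """Label each gene family as 'core', 'shell', or 'cloud'.

    strain_genes: dict mapping strain name -> set of gene-family names.
    core_frac / cloud_frac: assumed frequency thresholds (fraction of strains).
    """
    n_strains = len(strain_genes)
    counts = Counter(g for genes in strain_genes.values() for g in set(genes))
    labels = {}
    for gene, count in counts.items():
        freq = count / n_strains
        if freq >= core_frac:
            labels[gene] = "core"      # present in (nearly) all strains
        elif freq < cloud_frac:
            labels[gene] = "cloud"     # rare, often strain-specific
        else:
            labels[gene] = "shell"     # intermediate frequency
    return labels


# Toy presence/absence data (hypothetical strains and gene names).
toy_strains = {
    "s1": {"butA", "xynB", "tetW"},
    "s2": {"butA", "xynB"},
    "s3": {"butA", "rhaT"},
    "s4": {"butA"},
}
# A looser cloud threshold is used here because the toy set has only 4 strains.
labels = classify_genes(toy_strains, cloud_frac=0.3)
print(labels)
```

With only a handful of genomes, the frequency cutoffs are coarse, so the share of genes labelled core or cloud depends strongly on the thresholds chosen.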

RevDate: 2025-03-15

Simon V, Trouillon J, Attrée I, et al (2025)

Functional and Pangenomic Exploration of Roc Two-Component Regulatory Systems Identifies Novel Players Across Pseudomonas Species.

Molecular microbiology [Epub ahead of print].

The opportunistic pathogen Pseudomonas aeruginosa relies on a large collection of two-component regulatory systems (TCSs) to sense and adapt to changing environments. Among them, the Roc (regulation of cup) system is a one-of-a-kind network of branched TCSs, composed of two histidine kinases (HKs: RocS1 and RocS2) interacting with three response regulators (RRs: RocA1, RocR, and RocA2), which regulate virulence, antibiotic resistance, and biofilm formation. Based on extensive work on the Roc system, previous data suggested the existence of other key regulators yet to be discovered. In this work, we identified PA4080, renamed RocA3, as a fourth RR that is activated by RocS1 and RocS2 and that positively controls the expression of the cupB operon. Comparative genomic analysis of the locus identified a gene, rocR3, adjacent to rocA3 in a subpopulation of strains that encodes a protein with structural and functional similarity to the c-di-GMP phosphodiesterase RocR. Furthermore, we identified a fourth branch of the Roc system consisting of the PA2583 HK, renamed RocS4, and the Hpt protein HptA. Using a bacterial two-hybrid system, we showed that RocS4 interacts with HptA, which in turn interacts with RocA1, RocA2, and RocR3. Finally, we mapped the pangenomic RRs repertoire, establishing a comprehensive view of the plasticity of such regulators among clades of the species. Overall, our work provides a comprehensive inter-species definition of the Roc system, nearly doubling the number of proteins known to be involved in this interconnected network of TCSs controlling pathogenicity in Pseudomonas species.

RevDate: 2025-03-14
CmpDate: 2025-03-14

Kumari K, Sinha A, Sharma PK, et al (2025)

In-depth genome and comparative genome analysis of a metal-resistant environmental isolate Pseudomonas aeruginosa S-8.

Frontiers in cellular and infection microbiology, 15:1511507.

The present study aimed to identify the mechanisms underlying the survival of an environmental bacterium originally isolated from the waste-contaminated soil of Jhiri, Ranchi, India. Based on 16S rRNA, ANI (average nucleotide identity), and BLAST Ring Image Generator (BRIG) analysis, the isolated strain was identified as Pseudomonas aeruginosa. The present study extends the characterization of this bacterium through genomic and comparative genomic analysis to understand the genomic features pertaining to survival in stressed environments. Sequencing of the bacterium on the Illumina HiSeq platform revealed that it possessed a 6.8 Mb circular chromosome with 65.9% GC content and 63 RNA sequences. The genome also harbored several genes associated with plant growth promotion, i.e., phytohormone and siderophore production, phosphate solubilization, motility, and biofilm formation. The genomic analysis with online tools revealed various genes belonging to the bacterial secretion system, antibiotic resistance, virulence, and efflux pumps. The presence of biosynthetic gene clusters (BGCs) indicated that large numbers of genes were associated with non-ribosomal peptide synthetase, polyketide synthetase, and other secondary metabolite production. Additionally, its genome encodes various CAZymes such as glycoside hydrolases and other genes associated with lignocellulose breakdown, suggesting that strain S-8 has strong biomass degradation potential. Furthermore, pan-genome analysis based on a comparison of whole genomes showed that the core genome represented the largest part of the gene pool. Therefore, genome and comparative genome analysis of Pseudomonas strains is valuable for understanding the mechanism of resistance to metal stress, genome evolution, and HGT events, and opens a new perspective for exploiting a newly isolated bacterium for biotechnological applications.

RevDate: 2025-03-14

Shi G, Dai Y, Zhou D, et al (2025)

An alignment- and reference-free strategy using k-mer present pattern for population genomic analyses.

Mycology, 16(1):309-323.

Pangenomes are replacing single reference genomes to capture all variants within a species or clade, but their analysis predominantly leverages graph-based methods that require multiple high-quality genomes and computationally intensive multiple-genome alignments. K-mer decomposition is an alternative to graph-based pangenomes. However, how to directly use k-mers for population genetic analyses is unknown. Here, we developed a novel strategy that uses the variation in k-mer counts in the genome for population analyses. To test the effectiveness of this method, we compared it directly to the SNP-based method on the analysis of population structure and genetic diversity of 267 Saccharomyces cerevisiae strains within two simulated datasets and a real sequence dataset. The population structure identified with k-mers recapitulates that obtained using SNPs, indicating the effectiveness of the k-mer-based approach, and the higher genetic diversity within the real dataset supported the view that k-mers contained more genetic variants. Based on k-mer frequency, we found not only SNPs but also some insertions/deletions and horizontal gene transfer (HGT) fragments related to the adaptive evolution of S. cerevisiae. Our study creates a framework for the alignment- and reference-free (ARF) method in population genetic analyses, whose advantages will be more pronounced in species with no complete genome or in highly diverged species.
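To make the idea of an alignment- and reference-free comparison concrete, here is a minimal sketch that counts k-mers in two sequences and compares them by the Jaccard distance of their k-mer sets. This is an assumption-laden illustration, not the method of the paper: the helper names `kmer_counts` and `jaccard_distance`, the choice of Jaccard distance, the value k = 3, and the toy sequences are all hypothetical.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every k-mer in seq, skipping windows with non-ACGT characters."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def jaccard_distance(counts_a, counts_b):
    """1 - |shared k-mers| / |all k-mers|, using presence/absence only."""
    kmers_a, kmers_b = set(counts_a), set(counts_b)
    union = kmers_a | kmers_b
    if not union:
        return 0.0
    return 1.0 - len(kmers_a & kmers_b) / len(union)


# Hypothetical toy sequences standing in for two strains.
c1 = kmer_counts("ACGTACGT", 3)
c2 = kmer_counts("ACGTTCGT", 3)
distance = jaccard_distance(c1, c2)

# k-mers whose counts differ between the two samples.
variable = sorted(k for k in set(c1) | set(c2) if c1[k] != c2[k])
print(distance, variable)
```

A pairwise distance matrix built this way could feed a standard clustering or ordination step; real genomes would also need a much larger k and usually canonical (strand-independent) k-mers.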

RevDate: 2025-03-15

Arend M, Paulitz E, Hsieh YE, et al (2025)

Scaling metabolic model reconstruction up to the pan-genome level: A systematic review and prospective applications to photosynthetic organisms.

Metabolic engineering, 90:67-77 pii:S1096-7176(25)00028-X [Epub ahead of print].

Advances in genomics technologies have generated large data sets that provide tremendous insights into the genetic diversity of taxonomic groups. However, it remains challenging to pinpoint the effect of genetic diversity on different traits without performing resource-intensive phenotyping experiments. Pan-genome-scale metabolic models (panGEMs) extend traditional genome-scale metabolic models by considering the entire reaction repertoire that enables the prediction and comparison of metabolic capabilities within a taxonomic group. Here, we systematically review the state-of-the-art methodologies for constructing panGEMs, focusing on used tools, databases, experimental datasets, and orthology relationships. We highlight the unique advantages of panGEMs compared to single-species GEMs in predicting metabolic phenotypes and in guiding the experimental validation of genome annotations. In addition, we emphasize the disparity between the available (pan-)genomic data on photosynthetic organisms and their under-representation in current (pan)GEMs. Finally, we propose a perspective for tackling the reconstruction of panGEMs for photosynthetic eukaryotes that can help advance our understanding of the metabolic diversity in this taxonomic group.

RevDate: 2025-03-13
CmpDate: 2025-03-13

Guo Y, Han Y, Gao J, et al (2025)

Rapid Identification of Alien Chromosome Fragments and Tracing of Bioactive Compound Genes in Intergeneric Hybrid Offspring Between Brassica napus and Isatis indigotica Based on AMAC Method.

International journal of molecular sciences, 26(5):.

Distant hybridization between Brassica napus and related genera serves as an effective approach for rapeseed germplasm innovation. Isatis indigotica, a wild relative of Brassica, has emerged as a valuable genetic resource for rapeseed improvement due to its medicinal properties. This study employed the anchor mapping of alien chromosomal fragment localization (AMAC) method to efficiently identify alien chromosomal fragments in the progeny derived from distant hybridization between I. indigotica and Brassica napus, 'Songyou No. 1'. Based on the AMAC method, we developed 193,101 IP and SSR markers utilizing the I. indigotica reference genome (Woad-v1.0). Through Electronic-PCR analysis against the Brassica and I. indigotica pan-genome, 27,820 specific single-locus (SSL) IP and SSR markers were obtained. Subsequently, 205 pairs of IP primers and 50 pairs of SSR primers were synthesized randomly, among which 148 pairs of IP markers (72.20%) and 45 pairs of SSR markers (90%) were verified as SSL molecular markers for the I. indigotica genome with no amplification product in four Brassica crops. These 193 SSL markers enabled precise identification of one complete I6 chromosome and three chromosomal fragments (I1:1.17 Mb, I5:2.61 Mb, I7:1.11 Mb) in 'Songyou No. 1'. Furthermore, we traced 32 genes involved in bioactive compound biosynthesis within/near these alien segments in 'Songyou No. 1' and developed seven functional markers. This study not only validates the efficacy of SSL markers for detecting exogenous chromatin in intergeneric hybrids but also provides valuable insights for the precise identification and mapping of desired chromosomal fragments or genes embedded in the derivatives of distant hybridization, with potential applications in marker-assisted breeding for medicinal plants via a distant hybridization strategy between I. indigotica and Brassica crops.

RevDate: 2025-03-13

Morales-Díaz N, Sushko S, Campos-Dominguez L, et al (2025)

Tandem LTR-retrotransposon structures are common and highly polymorphic in plant genomes.

Mobile DNA, 16(1):10.

BACKGROUND: LTR-retrotransposons (LTR-RT) are a major component of plant genomes and important drivers of genome evolution. Most LTR-RT copies in plant genomes are defective elements found as truncated copies, nested insertions or as part of more complex structures. The recent availability of highly contiguous plant genome assemblies based on long-read sequences now allows detailed characterization of these complex structures and evaluation of their importance for plant genome evolution.

RESULTS: The detailed analysis of two rice loci containing complex LTR-RT structures showed that they consist of tandem arrays of LTR copies sharing internal LTRs. Our analyses suggest that these LTR-RT tandems are the result of a single insertion and not of the recombination of two independent LTR-RT elements. Our results also suggest that gypsy elements may be more prone to form these structures. We show that these structures are highly polymorphic in rice and therefore have the potential to generate genetic variability. We have developed a computational pipeline (IDENTAM) that scans genome sequences and identifies tandem LTR-RT candidates. Using this tool, we have detected 266 tandems in a pangenome built from the genomes of 76 accessions of cultivated and wild rice, showing that tandem LTR-RT structures are frequent and highly polymorphic in rice. Running IDENTAM on the Arabidopsis, almond and cotton genomes showed that LTR-RT tandems are frequent in plant genomes of different size, complexity and ploidy level. The complexity of differentiating intra-element variations at the nucleotide level among haplotypes is very high, and we found that graph-based pangenomic methodologies are appropriate to resolve these structures.

CONCLUSIONS: Our results show that LTR-RT elements can form tandem arrays. These structures are relatively abundant and highly polymorphic in rice and are widespread in the plant kingdom. Future studies will contribute to understanding how these structures originate and whether the variability that they generate has a functional impact.

RevDate: 2025-03-13
CmpDate: 2025-03-13

Sharma J, Jangale V, Shekhawat RS, et al (2025)

Improving genetic variant identification for quantitative traits using ensemble learning-based approaches.

BMC genomics, 26(1):237.

BACKGROUND: Genome-wide association studies (GWAS) are rapidly advancing due to the improved resolution and completeness provided by Telomere-to-Telomere (T2T) and pangenome assemblies. While recent advancements in GWAS methods have primarily focused on identifying genetic variants associated with discrete phenotypes, approaches for quantitative traits (QTs) remain underdeveloped. This has often led to significant variants being overlooked due to biases from genotype multicollinearity and strict p-value thresholds.

RESULTS: We propose an enhanced ensemble learning approach for QT analysis that integrates regularized variant selection with machine learning-based association methods, validated through comprehensive biological enrichment analysis. We benchmarked four widely recognized single nucleotide polymorphism (SNP) feature selection methods (least absolute shrinkage and selection operator, ridge regression, elastic-net, and mutual information) alongside four association methods: linear regression, random forest, support vector regression (SVR), and XGBoost. Our approach is evaluated on simulated datasets and validated using a subset of the PennCATH real dataset, including imputed versions, focusing on low-density lipoprotein (LDL)-cholesterol levels as a QT. The combination of elastic-net with SVR outperformed other methods across all datasets. Functional annotation of the top 100 SNPs identified through this superior ensemble method revealed their expression in tissues involved in LDL cholesterol regulation. We also confirmed the involvement of six known genes (APOB, TRAPPC9, RAB2A, CCL24, FCHO2, and EEPD1) in cholesterol-related pathways and identified potential drug targets, including APOB, PTK2B, and PTPN12.

CONCLUSIONS: Our ensemble learning approach effectively identifies variants associated with QTs, and we expect its performance to improve further with the integration of T2T and pangenome references in future GWAS.

RevDate: 2025-03-12

Pragasam AK, Maurya S, Jain K, et al (2025)

Invasive Salmonella Typhimurium Colonizes Gallbladder and Contributes to Gallbladder Carcinogenesis through Activation of Host Epigenetic Modulator KDM6B.

Cancer letters pii:S0304-3835(25)00185-5 [Epub ahead of print].

Gallbladder stones alone do not explain the risk of gallbladder cancer (GBC) as the sole etiological factor. Chronic microbial infection, particularly Salmonella, has been implicated in GB carcinogenesis, but its causative role and the underlying mechanisms are largely unknown. We studied gut and gallbladder tissue microbiome through targeted metagenomics to identify pathogenic bacteria in GBC. Virulence and pathogenicity of identified Salmonella Typhimurium from GBC tissue were studied after culture by whole genome sequencing, phylogenetic analysis, mutational profiling, and pangenome analysis. Mechanistic studies for GBC carcinogenesis were carried out in a mouse model of gallstones and chronic Salmonella infection, a cellular model using GBC (NOZ) cell lines, and a xenograft tumor model. We found an increased abundance of Salmonella in the gut microbiome of patients with GBC and culturable S. Typhimurium from the gallbladder cancer tissue. Comparative genomics of S. Typhimurium isolated from the GBC tissue showed a high invasive index. S. Typhimurium isolates harbored horizontally acquired virulence functions in their accessory genome. Chronic S. Typhimurium infection caused chronic inflammation, pre-malignant changes, and tumor-promoting mechanisms in the mouse model with gallbladder stones with activation of the epigenetic modulator KDM6B both in the mouse model and human GBC. Inhibition of KDM6B reduced engrafted tumor size in SCID mice. Of the differentially regulated genes in human GBC tissue, ADAMTSL5, CX3CR1, and SPSB4 were also significantly dysregulated in NOZ cells infected with Salmonella. Chronic Salmonella infection contributes to gallbladder carcinogenesis through a host epigenetic mechanism involving KDM6B.

RevDate: 2025-03-12

Thomson RM, Wheeler N, Stockwell RE, et al (2025)

Infection by Clonally Related Mycobacterium abscessus Isolates: The Role of Drinking Water.

American journal of respiratory and critical care medicine [Epub ahead of print].

RATIONALE: Mycobacterium abscessus group bacteria (MABS) cause lethal infections in people with chronic lung diseases. Transmission mechanisms remain poorly understood; the detection of dominant circulating clones (DCCs) has suggested potential for person-to-person transmission.

OBJECTIVES: This study aimed to determine the role of drinking water in the transmission of MABS.

METHODS: A total of 289 isolates were cultured from respiratory samples (231) and drinking water sources (58) across Queensland, Australia.

MEASUREMENTS AND MAIN RESULTS: Whole genome sequences were analysed to identify DCCs and determine relatedness. Half of the isolates (144, 49.8%) clustered with previously described DCCs, of which 30 formed a clade within DCC5. Pangenomic analysis of the water-associated DCC5 clade revealed an enrichment of genes associated with copper resistance. Four instances of plausible epidemiological links were identified between genomically related clinical and water isolates.

CONCLUSIONS: We provide evidence that drinking water is a reservoir for MABS and may be a vector in the chain of MABS infection.

RevDate: 2025-03-12

Marla SR, Olatoye M, Davis M, et al (2025)

Mining sorghum pangenome enabled identification of new dw3 alleles for breeding stable-dwarfing hybrids.

G3 (Bethesda, Md.) pii:8071393 [Epub ahead of print].

Allele mining of crop pangenomes can enable the identification of novel variants for trait improvement, increase crop genetic diversity, and purge deleterious mutations around fixed genomic regions. Sorghum, a C4 cereal crop domesticated in the tropics, was selected for reduced plant height and maturity to develop combine-harvestable and photoperiod-insensitive US grain sorghums. To breed semi-dwarf US grain sorghum hybrids, public and private sector programs mostly used the dw3-ref allele, which produces undesirable height revertants (frequency of 0.1-0.3%) due to uneven crossing over at the 882 bp tandem duplication region. Although the dw3-ref allele produces revertants, US sorghum breeding programs continued using this allele in the absence of identified allelic variants that suppress revertants. In this study, we leveraged a sorghum pangenome resource (resequenced sorghum association panel and a global diversity panel of 1661 lines) to identify seven loss-of-function variants in the Dw3 gene using SnpEff variant effect prediction. We validated that the Segaolane dw3 loss-of-function variant, resulting from a 137 bp deletion in the third exon, suppresses revertant production. Segaolane NAM family RILs with the dw3-ref allele produced revertants, while no revertants were observed in RILs with the Segaolane dw3 allele. The availability of resequencing data enabled the design of haplotype-based markers detecting the Segaolane stable dw3 allele for marker-assisted trait introgression into elite sorghum breeding lines. By mining new stable-dwarfing dw3 alleles, this research demonstrated the application of the sorghum pan-genome to trait improvement and to developing marker-assisted breeding strategies.

RevDate: 2025-03-12
CmpDate: 2025-03-12

Huang YX, Rao HY, Su BS, et al (2025)

The pan-genome of Spodoptera frugiperda provides new insights into genome evolution and horizontal gene transfer.

Communications biology, 8(1):407.

Spodoptera frugiperda is a common and severely damaging agricultural pest. In-depth analysis of its population genomics and transcriptomics is crucial for providing references for pest control efforts. This study, focused on the extensive variation in the genome size of S. frugiperda, constructed its pan-genome and identified 1.37 Gb of non-reference sequences, highlighting significant genetic variation within the population. Analysis of Long Terminal Repeat (LTR) Presence/Absence Variation (PAV) suggests that LTR alterations may be one of the driving factors for genome size variation. Additionally, population gene PAV analysis revealed that variable genes are enriched in functions like acetyltransferase activity, which might be associated with detoxification, implying diverse selection pressures related to detoxification in different S. frugiperda populations. Moreover, 19 horizontal gene transfer (HGT) acquired genes were identified in the reference genome used in this study, which responded to 16 different treatments. Notably, three HGT-acquired genes (SFR02618, SFR05248, and SFR05249) were co-expressed with the heat shock protein family and responded under treatments with Avermectin and Cypermethrin. This may indicate their involvement in a detoxification mechanism coordinated with heat shock proteins. These results offer new insights into the genomic evolution of S. frugiperda and the potential functions of HGT-acquired genes.

RevDate: 2025-03-11

Ahsan T, Mahnoor, Alharbi SA, et al (2025)

Genome Mining and Antagonism of Stenotrophomonas geniculata MK-1, Against Peanut Foliage Fungus Diseases.

Journal of basic microbiology [Epub ahead of print].

Stenotrophomonas geniculata, a bacterium, has been recognized as an eco-friendly substitute for chemical fungicides in managing the peanut foliar diseases web blotch and early leaf spot. Core genome and pan-genome analysis identified that strain MK-1 belongs to Stenotrophomonas geniculata, and single nucleotide polymorphism (SNP) analysis confirmed that the strain belongs to Stenotrophomonas maltophilia. The research revealed that S. geniculata MK-1 had a notable antagonistic impact on Peyronellaea arachidicola and Cercospora arachidicola and demonstrated a biocontrol efficacy of over 95% against peanut early leaf spot and web blotch disease. The nonredundant protein sequences (NR) database identified 4324 annotations related to S. geniculata, with 2682 genes similar to strain MK-1. The COG database categorized 3041 annotations into 22 functional groups, and 33 distinct metabolic pathways were associated with 1851 Kyoto Encyclopedia of Genes and Genomes (KEGG) annotations. Most genes linked with metabolism are found in S. geniculata, with 380 genes related to carbohydrate metabolism and 44 genes related to secondary metabolite biosynthesis. The Carbohydrate-Active enZYmes (CAZy) database identified 194 annotations linked to non-ribosomal synthesis of secondary metabolites. The Pathogen-Host Interactions (PHI) database showed reduced virulence in strain MK-1, while unaffected pathogenicity protein counts were 52. The MK-1 strain can produce antifungal siderophores secondary metabolites, non-ribosomal peptide synthetase (NRPS), and siderophores.

RevDate: 2025-03-11
CmpDate: 2025-03-10

Manoharan R, Nair CS, Nishanth D, et al (2025)

Crop Wild Relatives (CWRs) in the United Arab Emirates: Resources for Climate Resilience and Their Potential Medicinal Applications.

Drug design, development and therapy, 19:1515-1525.

Global climate change threatens the production, growth, and sustainability of plants. Crop wild relatives (CWRs) offer a practical and sustainable solution to these climatic issues by boosting genetic diversity and crop resilience. Even though CWRs are wild relatives of domesticated plants, they are nevertheless mostly neglected. This review focuses on the possible application of CWRs that are found in the United Arab Emirates (UAE) and are known for their abiotic stress tolerance and potential medicinal properties. Traditionally, CWRs have been used as medicine for various ailments, as they are rich in phytochemical compounds. However, the medicinal potential of these wild plant species is decreasing at an alarming rate due to climate change stress factors. The medicinal potential of these native crop wild plant species must be investigated because they could be a useful asset in the healthcare sector. Research on pangenomics studies of certain CWRs is also highlighted in the review, which reveals genetic variability caused by climate change stress factors and how these changes in genetic variability affect the production of secondary metabolites that have potent medicinal value. This provides insights into developing personalized medicine, in which particular CWR plant species can be chosen or modified to generate medicinal compounds. Despite their superior medicinal properties, many CWRs in the UAE are still not well understood. Finding the desired genes coding for the biosynthesis of specific phytochemicals or secondary metabolites may help us better understand how these substances are synthesized and how to increase their production for a range of treatments.

RevDate: 2025-03-10

Zebell SG, Martí-Gómez C, Fitzgerald B, et al (2025)

Cryptic variation fuels plant phenotypic change through hierarchical epistasis.

bioRxiv : the preprint server for biology pii:2025.02.23.639722.

Cryptic genetic variants exert minimal or no phenotypic effects alone but have long been hypothesized to form a vast, hidden reservoir of genetic diversity that drives trait evolvability through epistatic interactions. This classical theory has been reinvigorated by pan-genome sequencing, which has revealed pervasive variation within gene families and regulatory networks, including extensive cis-regulatory changes, gene duplication, and divergence between paralogs. Nevertheless, empirical testing of cryptic variation's capacity to fuel phenotypic diversification has been hindered by intractable genetics, limited allelic diversity, and inadequate phenotypic resolution. Here, guided by natural and engineered cis-regulatory cryptic variants in a recently evolved paralogous gene pair, we identified an additional pair of redundant trans regulators, establishing a regulatory network that controls tomato inflorescence architecture. By combining coding mutations with a cis-regulatory allelic series in populations segregating for all four network genes, we systematically constructed a collection of 216 genotypes spanning the full spectrum of inflorescence complexity and quantified branching in over 27,000 inflorescences. Analysis of the resulting high-resolution genotype-phenotype map revealed a layer of dose-dependent interactions within paralog pairs that enhances branching, culminating in strong, synergistic effects. However, we also uncovered an unexpected layer of antagonism between paralog pairs, where accumulating mutations in one pair progressively diminished the effects of mutations in the other. Our results demonstrate how gene regulatory network architecture and complex dosage effects from paralog diversification converge to shape phenotypic space under a hierarchical model of epistatic interactions. 
Given the prevalence of paralog evolution in genomes, we propose that paralogous cryptic variation within regulatory networks elicits hierarchies of epistatic interactions, catalyzing bursts of phenotypic change. Keywords: cryptic mutations, paralogs, redundancy, cis-regulatory, tomato, inflorescence, gene regulatory network, modeling, epistasis.

RevDate: 2025-03-10

Depuydt L, Ahmed OY, Fostier J, et al (2025)

Run-length compressed metagenomic read classification with SMEM-finding and tagging.

bioRxiv : the preprint server for biology pii:2025.02.25.640119.

Metagenomic read classification is a fundamental task in computational biology, yet it remains challenging due to the scale, diversity, and complexity of sequencing datasets. We propose a novel, lossless, run-length compressed index that enables efficient multi-class metagenomic classification in O(r) space, based on the move structure. Our method identifies all super-maximal exact matches (SMEMs) of length at least L between a read and the reference dataset and associates each SMEM with one class identifier using a sampled tag array. A consensus algorithm then compacts these SMEMs with their class identifier into a single classification per read. We are the first to perform run-length compressed read classification based on full SMEMs instead of semi-SMEMs. We evaluate our approach on both long and short reads in two conceptually distinct datasets: a large bacterial pan-genome with few metagenomic classes and a smaller 16S rRNA gene database spanning thousands of genera or classes. Our method consistently outperforms SPUMONI 2 in accuracy and runtime, with only a modest memory overhead. Compared to Cliffy, we demonstrate better memory efficiency while achieving superior accuracy on the simpler dataset and comparable performance on the more complex one. Overall, our implementation carefully balances accuracy, runtime, and memory usage, offering a versatile solution for metagenomic classification across diverse datasets. The open-source C++11 implementation is available at https://github.com/biointec/tagger under the AGPL-3.0 license.
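For readers unfamiliar with SMEMs, the sketch below shows what the abstract's "super-maximal exact matches of length at least L" means on a toy example. It is a deliberately naive illustration, not the tool's algorithm: it uses plain substring search instead of a run-length compressed index, it omits the tag array and consensus step entirely, and the function name `find_smems` and the toy sequences are hypothetical.

```python
def find_smems(read, reference, min_len=1):
    """Return half-open (start, end) read intervals that are SMEMs.

    An SMEM is a substring of the read that occurs in the reference and
    cannot be extended to the left or right on the read while still
    occurring somewhere in the reference.
    """
    smems = []
    prev_end = 0  # end of the longest match that started one position earlier
    for start in range(len(read)):
        # The longest match from `start` reaches at least as far as the
        # previous one, because a suffix of a match is itself a match.
        end = max(prev_end, start)
        while end < len(read) and read[start:end + 1] in reference:
            end += 1
        # Non-empty, and left-maximal: the previous start did not reach `end`.
        if end > start and end > prev_end and end - start >= min_len:
            smems.append((start, end))
        prev_end = end
    return smems


# Hypothetical toy data.
reference = "ACGTACGGA"
read = "CGTAGGA"
all_smems = find_smems(read, reference)
long_smems = find_smems(read, reference, min_len=4)
print(all_smems, long_smems)
```

Here the read splits into two SMEMs, "CGTA" and "GGA", and only the first survives a length threshold of 4. Each substring test scans the whole reference, so this sketch is only suitable for tiny inputs; avoiding that cost is the purpose of the compressed index described in the abstract.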

RevDate: 2025-03-10

Yang Q, Sun Y, Duan S, et al (2025)

High-quality Population-specific Haplotype-resolved Reference Panel in the Genomic and Pangenomic Eras.

Genomics, proteomics & bioinformatics pii:8064792 [Epub ahead of print].

Large-scale international and regional human genomic and pangenomic resources derived from population-scale biobanks and ancient DNA sequences have provided significant insights into human evolution and the genetic determinants of complex diseases and traits. Despite these advances, challenges persist in optimizing the integration of phasing tools, merging haplotype reference panels (HRPs), developing imputation algorithms, and fully exploiting the diverse applications of post-imputation data. This review comprehensively summarizes the advancements, applications, limitations, and future directions of HRPs in human genomics research. Recent progress in the reconstruction of HRPs, based on over 830,000 human whole-genome sequences, has been synthesized, highlighting the broad spectrum of human genetic diversity captured. Additionally, we recapitulate advancements in fifty-six HRPs for global and regional populations. The evaluation of imputation accuracy indicated that Beagle and GLIMPSE are the most effective tools for phasing and imputing data from genotyping arrays and low-coverage sequencing, respectively. A critical strategy for selecting an appropriate HRP involves matching the population background of target groups with HRP reference populations and considering multi-ancestry or homogeneous genetic structures. The necessity of a single, integrative, high-quality HRP that captures haplotype structures and genetic diversity across various genetic variation types from globally representative populations is emphasized to support both modern and ancient genomic research and advance human precision medicine.

RevDate: 2025-03-10

Liu W, P Cui (2025)

Haplotype-based Pangenomics: A Blueprint for Climate Adaptation in Plants.

Genomics, proteomics & bioinformatics pii:8064791 [Epub ahead of print].

RevDate: 2025-03-08

Zhang R, Dai C, Gong R, et al (2025)

Gapless genome assembly and pan-genome of Brassica juncea provide insights into seed quality improvement and environmental adaptation.

Plant communications pii:S2590-3462(25)00060-4 [Epub ahead of print].

RevDate: 2025-03-07

Palma-Martínez MJ, Posadas-García YS, Shaukat A, et al (2025)

Evolution, genetic diversity, and health.

Nature medicine [Epub ahead of print].

Human genetic diversity in today's world has been shaped by evolutionary history, demographic shifts and environmental exposures, influencing complex traits, disease susceptibility and drug responses. Capturing this diversity is essential for advancing precision medicine and promoting equitable healthcare. Despite the great progress achieved with initiatives such as the human Pangenome and large biobanks that aim for a better representation of human diversity, important challenges remain. In this Perspective, we discuss the importance of diversity in clinical genomics through an evolutionary lens. We highlight progress and challenges and outline key clinical applications of diverse genetic data. We argue that diversifying both datasets and methodologies, integrating ancestral and environmental factors, is crucial for fully understanding the genetic basis of human health and disease.

RevDate: 2025-03-07

Nagasaki M, Hirayasu K, Khor SS, et al (2025)

JoGo-LILR caller: Unveiling and navigating the complex diversity of LILRB3-LILRA6 copy number haplotype structures with whole-genome sequencing.

Human immunology, 86(3):111272 pii:S0198-8859(25)00043-6 [Epub ahead of print].

Leukocyte immunoglobulin-like receptors (LILRs), encoded on human chromosome 19q13.4, comprise a set of 11 immunoglobulin superfamily receptors known for their genetic heterogeneity. Notably, LILRB3 and LILRA6 within this cluster exhibit pronounced sequence homology in immunoglobulin-like domains involved in ligand binding and variable copy number (CN) states. However, understanding their precise role remains challenging. To address this difficulty, we developed an algorithm and tool named JoGo-LILR Caller, which jointly calls CNs of LILRB3 and LILRA6 from a population-scale whole-genome short-read sequencing dataset. This tool was applied to 2,504 international HapMap samples and yielded a global CN profile. This profile was corroborated by a 100% concordance rate with the CN data obtained from 40 samples of pangenome reference assemblies provided by the Human Pangenome Reference Consortium (HPRC). The frequencies of LILRB3-LILRA6 CN haplotype structures were also estimated for five continental groups with a global CN profile. The established allele frequency profile allowed our tool to estimate LILRB3-LILRA6 CN haplotype combinations. JoGo-LILR-trio enhanced the prediction reliability for haplotype pairs within trio datasets, with trio analysis on 40 child samples demonstrating 100% concordance between the predicted pair of haploid CN types and the diploid reference assemblies. Its utility will extend to facilitating software advancements for imputing LILRB3-LILRA6 CN types from SNP array genotyping data, enabling subsequent association analyses that link these CN types to diverse phenotypic traits and diseases, e.g., inflammatory bowel diseases and Takayasu arteritis.

RevDate: 2025-03-09
CmpDate: 2025-03-07

Brant E, Zuniga-Soto E, F Altpeter (2025)

RNAi and genome editing of sugarcane: Progress and prospects.

The Plant journal : for cell and molecular biology, 121(5):e70048.

Sugarcane, which provides 80% of global table sugar and 40% of biofuel, presents unique breeding challenges due to its highly polyploid, heterozygous, and frequently aneuploid genome. Significant progress has been made in developing genetic resources, including the recently completed reference genome of the sugarcane cultivar R570 and pan-genomic resources from sorghum, a closely related diploid species. Biotechnological approaches including RNA interference (RNAi), overexpression of transgenes, and gene editing technologies offer promising avenues for accelerating sugarcane improvement. These methods have successfully targeted genes involved in important traits such as sucrose accumulation, lignin biosynthesis, biomass oil accumulation, and stress response. One of the main transformation methods (biolistic gene transfer or Agrobacterium-mediated transformation), coupled with efficient tissue culture protocols, is typically used for implementing these biotechnology approaches. Emerging technologies show promise for overcoming current limitations. The use of morphogenic genes can help address genotype constraints and improve transformation efficiency. Tissue culture-free technologies, such as spray-induced gene silencing, virus-induced gene silencing, or virus-induced gene editing, offer potential for accelerating functional genomics studies. Additionally, novel approaches including base and prime editing, orthogonal synthetic transcription factors, and synthetic directed evolution present opportunities for enhancing sugarcane traits. These advances collectively aim to improve sugarcane's efficiency as a crop for both sugar and biofuel production. This review aims to discuss the progress made in sugarcane methodologies, with a focus on RNAi and gene editing approaches, how RNAi can be used to inform functional gene targets, and future improvements and applications.

RevDate: 2025-03-06
CmpDate: 2025-03-06

Chacón RD, Ramírez M, Suárez-Agüero D, et al (2025)

Genomic Differences in Antimicrobial Resistance and Virulence Among Key Salmonella Strains of Serogroups B and D1 in Brazilian Poultry.

Current microbiology, 82(4):173.

Salmonella is a significant threat to Brazilian poultry, causing economic losses and public health risks. This study analyzed 15 Salmonella isolates along with 45 retrieved complete genomes, including serovars Gallinarum, Pullorum, Enteritidis, Typhimurium, and Heidelberg. Biochemical characterization, antimicrobial susceptibility testing, whole-genome sequencing, and comparative genomics were performed. The studied strains exhibited high levels of antimicrobial resistance, particularly to tilmicosin, penicillin/novobiocin, nalidixic acid, and streptomycin. Genomic analysis revealed diverse virulence factors and antibiotic resistance genes (ARGs), with zoonotic strains showing higher virulence compared to avian-adapted strains. Multiple plasmid types carrying ARGs were identified, highlighting the potential for horizontal gene transfer. Pangenomic and phylogenomic analyses differentiated Salmonella strains from serogroup D1 from those from serogroup B. These findings emphasize the need for comprehensive surveillance and control measures to mitigate the impact of Salmonella on both animal and human health in Brazil.

RevDate: 2025-03-06
CmpDate: 2025-03-06

Versoza CJ, Ehmke EE, Jensen JD, et al (2025)

Characterizing the Rates and Patterns of De Novo Germline Mutations in the Aye-Aye (Daubentonia madagascariensis).

Molecular biology and evolution, 42(3):.

Given the many levels of biological variation in mutation rates observed to date in primates (spanning from species to individuals to genomic regions), future steps in our understanding of mutation rate evolution will not only be aided by a greater breadth of species coverage across the primate clade but also by a greater depth as afforded by an evaluation of multiple trios within individual species. In order to help bridge these gaps, we here present an analysis of a species representing one of the most basal splits on the primate tree (aye-ayes), combining whole-genome sequencing of seven parent-offspring trios from a three-generation pedigree with a novel computational pipeline that takes advantage of recently developed pan-genome graphs, thereby circumventing the application of (highly subjective) quality metrics that has previously been shown to result in notable differences in the detection of de novo mutations and ultimately estimates of mutation rates. This deep sampling has enabled both a detailed picture of parental age effects and sex dependency in mutation rates, which we here compare with previously studied primates, but has also provided unique insights into the nature of genetic variation in one of the most endangered primates on the planet.

RevDate: 2025-03-05

Kamal N, M Spannagl (2025)

Genus-wide plant pangenome could inform next-generation crop design.

RevDate: 2025-03-05

Benoit M, Jenike KM, Satterlee JW, et al (2025)

Solanum pan-genetics reveals paralogues as contingencies in crop engineering.

Nature [Epub ahead of print].

Pan-genomics and genome-editing technologies are revolutionizing breeding of global crops[1,2]. A transformative opportunity lies in exchanging genotype-to-phenotype knowledge between major crops (that is, those cultivated globally) and indigenous crops (that is, those locally cultivated within a circumscribed area)[3-5] to enhance our food system. However, species-specific genetic variants and their interactions with desirable natural or engineered mutations pose barriers to achieving predictable phenotypic effects, even between related crops[6,7]. Here, by establishing a pan-genome of the crop-rich genus Solanum[8] and integrating functional genomics and pan-genetics, we show that gene duplication and subsequent paralogue diversification are major obstacles to genotype-to-phenotype predictability. Despite broad conservation of gene macrosynteny among chromosome-scale references for 22 species, including 13 indigenous crops, thousands of gene duplications, particularly within key domestication gene families, exhibited dynamic trajectories in sequence, expression and function. By augmenting our pan-genome with African eggplant cultivars[9] and applying quantitative genetics and genome editing, we dissected an intricate history of paralogue evolution affecting fruit size. The loss of a redundant paralogue of the classical fruit size regulator CLAVATA3 (CLV3)[10,11] was compensated by a lineage-specific tandem duplication. Subsequent pseudogenization of the derived copy, followed by a large cultivar-specific deletion, created a single fused CLV3 allele that modulates fruit organ number alongside an enzymatic gene controlling the same trait. Our findings demonstrate that paralogue diversifications over short timescales are underexplored contingencies in trait evolvability. Exposing and navigating these contingencies is crucial for translating genotype-to-phenotype relationships across species.

RevDate: 2025-03-05

Roberts MD, Davis O, Josephs EB, et al (2025)

k-mer-based approaches to bridging pangenomics and population genetics.

Molecular biology and evolution pii:8052716 [Epub ahead of print].

Many commonly studied species now have more than one chromosome-scale genome assembly, revealing a large amount of genetic diversity previously missed by approaches that map short reads to a single reference. However, many species still lack multiple reference genomes and correctly aligning references to build pangenomes can be challenging for many species, limiting our ability to study this missing genomic variation in population genetics. Here, we argue that k-mers are a very useful but underutilized tool for bridging the reference-focused paradigms of population genetics with the reference-free paradigms of pangenomics. We review current literature on the uses of k-mers for performing three core components of most population genetics analyses: identifying, measuring, and explaining patterns of genetic variation. We also demonstrate how different k-mer-based measures of genetic variation behave in population genetic simulations according to the choice of k, depth of sequencing coverage, and degree of data compression. Overall, we find that k-mer-based measures of genetic diversity scale consistently with pairwise nucleotide diversity (π) up to values of about π = 0.025 (R² = 0.97) for neutrally evolving populations. For populations with even more variation, using shorter k-mers will maintain the scalability up to at least π = 0.1. Furthermore, in our simulated populations, k-mer dissimilarity values can be reliably approximated from counting bloom filters, highlighting a potential avenue to decreasing the memory burden of k-mer-based genomic dissimilarity analyses. For future studies, there is a great opportunity to further develop methods for identifying selected loci using k-mers.
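As a rough illustration of what a k-mer-based dissimilarity measure can look like, the sketch below computes a Jaccard dissimilarity between the k-mer sets of two sequences. The choice of Jaccard distance and the function names `kmers` and `jaccard_dissimilarity` are assumptions made for this example; the abstract does not say which measures the authors evaluated.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings (k-mers) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_dissimilarity(seq_a, seq_b, k):
    """1 - |A ∩ B| / |A ∪ B| over the k-mer sets of two sequences.

    Assumption: Jaccard is used here only as one common set-based
    dissimilarity; it is not taken from the cited paper.
    """
    a, b = kmers(seq_a, k), kmers(seq_b, k)
    union = a | b
    if not union:
        # Both sequences are shorter than k, so there is nothing to compare.
        return 0.0
    return 1.0 - len(a & b) / len(union)


if __name__ == "__main__":
    # Two toy sequences that differ only in their last base.
    print(jaccard_dissimilarity("ACGT", "ACGA", k=2))
```

Shorter values of `k` make each single-base difference disrupt fewer k-mers, which is one intuition for why shorter k-mers tolerate more divergent populations.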

RevDate: 2025-03-05
CmpDate: 2025-03-05

Coggi M, Sgarlata A, Di Donato GW, et al (2024)

On the optimization of GWFA algorithm: enabling real-case applications supporting alignment backtracking.

Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference, 2024:1-4.

The Human Pangenome Reference Consortium (HPRC) proved that pangenome graphs represent a population's genetic variability more efficiently and accurately than linear references. Graphs can intrinsically encode variations as alternative paths inside a directed set of sequence nodes connected by edges. Despite their higher complexity, graph-based genome analysis pipelines are gaining significant interest, and the first sequence-to-graph aligners have already shown improvements in semi-global alignment. However, in pangenomics studies, the global alignment of long reads is fundamental for identifying structural variations and haplotype phasing. In this context, the Graph Wavefront Alignment (GWFA) algorithm emerged as the fastest strategy for aligning long reads to genomic graphs. However, the available GWFA implementation does not support alignment backtracking, a crucial feature in real-case studies. In this paper, we propose a new open-source[1] implementation of the GWFA algorithm that computes and reports the complete traceback in the standard GAF format. Our work achieves a 20× speedup in execution time compared to the state-of-the-art tool GraphAligner and competitive memory usage.
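The idea that a graph encodes variation as alternative paths through sequence nodes can be sketched with a toy example. The node contents, the adjacency-list layout, and the helper `spell_paths` below are all hypothetical and exist only for illustration; they are unrelated to the GWFA implementation or the GAF format.

```python
# Toy sequence graph: node id -> sequence label (hypothetical data).
nodes = {1: "ACG", 2: "T", 3: "G", 4: "CCA"}
# Directed edges as adjacency lists; nodes 2 and 3 form a SNP "bubble".
edges = {1: [2, 3], 2: [4], 3: [4], 4: []}


def spell_paths(nodes, edges, start, end):
    """Return the sequences spelled by every path from start to end.

    Assumes the graph is acyclic; a cyclic graph would loop forever.
    """
    spelled = []
    stack = [(start, nodes[start])]
    while stack:
        node, seq = stack.pop()
        if node == end:
            spelled.append(seq)
            continue
        for nxt in edges[node]:
            stack.append((nxt, seq + nodes[nxt]))
    return sorted(spelled)


if __name__ == "__main__":
    # Each path through the bubble corresponds to one haplotype.
    for haplotype in spell_paths(nodes, edges, 1, 4):
        print(haplotype)
```

A linear reference would store only one of these two haplotypes, whereas the graph stores the shared flanks once and the variant site as two parallel nodes.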

RevDate: 2025-03-04

Nassir N, A Almarri M, Akter H, et al (2025)

Advancing clinical genomics with Middle Eastern and South Asian pangenomes.

Nature medicine [Epub ahead of print].

RevDate: 2025-03-04

Wang N, Wang W, Q Zhu (2025)

Unlocking the genetic blueprint of bamboo for climate adaption.

Trends in plant science pii:S1360-1385(25)00041-X [Epub ahead of print].

In a recent study, Hou et al. developed a high-resolution, haplotype-based pangenome for moso bamboo (Phyllostachys edulis), revealing significant genetic diversity and over 1000 climate-associated variants. Their findings highlight adaptive mechanisms for the ecological resilience of bamboo, providing crucial insights for climate-resilient breeding and conservation to ensure the long-term ecological and economic benefits of moso bamboo amid climate change.

RevDate: 2025-03-04

Horsfield ST, Fok BCT, Fu Y, et al (2025)

Optimizing nanopore adaptive sampling for pneumococcal serotype surveillance in complex samples using the graph-based GNASTy algorithm.

Genome research pii:gr.279435.124 [Epub ahead of print].

Serotype surveillance of Streptococcus pneumoniae (the pneumococcus) is critical for understanding the effectiveness of current vaccination strategies. However, existing methods for serotyping are limited in their ability to identify the co-carriage of multiple pneumococci and detect novel serotypes. To develop a scalable and portable serotyping method that overcomes these challenges, we employed Nanopore Adaptive Sampling (NAS), an on-sequencer enrichment method that selects for target DNA in real-time, for direct detection of S. pneumoniae in complex samples. Whereas NAS targeting the whole S. pneumoniae genome was ineffective in the presence of nonpathogenic streptococci, the method was both specific and sensitive when targeting the capsular biosynthetic locus (CBL), the operon that determines S. pneumoniae serotype. NAS significantly improved coverage and yield of the CBL relative to sequencing without NAS, and accurately quantified the relative prevalence of serotypes in samples representing co-carriage. To maximize the sensitivity of NAS to detect novel serotypes, we developed and benchmarked a new pangenome-graph algorithm, named GNASTy. We show that GNASTy outperforms the current NAS implementation, which is based on linear genome alignment, when a sample contains a serotype absent from the database of targeted sequences. The methods developed in this work provide an improved approach for novel serotype discovery and routine S. pneumoniae surveillance that is fast, accurate and feasible in low-resource settings. Although NAS facilitates whole-genome enrichment under ideal circumstances, GNASTy enables targeted enrichment to optimize serotype surveillance in complex samples.

RevDate: 2025-03-04

Phillips A, Schultz CJ, RA Burton (2025)

New crops on the block: effective strategies to broaden our food, fibre and fuel repertoire in the face of increasingly volatile agricultural systems.

Journal of experimental botany pii:8045012 [Epub ahead of print].

Climate change poses significant challenges to our ability to keep a growing global population fed, clothed, and fuelled. This review sets the scene by summarising the impacts of climate change on production of the major grain crop species rice, wheat, and maize, with a focus on yield reductions due to abiotic stresses and altered disease pressures. We discuss efforts to improve resilience, emphasising traits such as water use efficiency (WUE), heat tolerance, and disease resistance. We move on to exploring production trends of established, re-emerging, and new crops, highlighting the challenges of developing and maintaining new arrivals in the global market. We analyse the potential of wild relatives for improving domesticated crops, or as candidates for de novo domestication. The importance of pangenomes for uncovering genetic variation for crop improvement is also discussed. We examine the impact of climate change on non-cereals, including fruit, nut, and fibre crops and the potential of alternative multiuse crops to increase global sustainability and address climate change-related challenges. Agave is used as an exemplar to demonstrate the strategic pathway for developing a robust new crop option. There is a need for sustained investment in research and development across the entire value chain to facilitate the exploration of diverse species and genetic resources to enhance crop resilience and adaptability to future environmental conditions.

RevDate: 2025-03-04

Villani F, Guarracino A, Ward RR, et al (2025)

Pangenome reconstruction in rats enhances genotype-phenotype mapping and variant discovery.

iScience, 28(2):111835.

The HXB/BXH family of recombinant inbred rat strains is a unique genetic resource that has been extensively phenotyped over 25 years, resulting in a vast dataset of quantitative molecular and physiological phenotypes. We built a pangenome graph from 10x Genomics Linked-Read data for 31 recombinant inbred rats to study genetic variation and association mapping. The pangenome includes 0.2 Gb of sequence that is not present in the reference mRatBN7.2, confirming the capture of substantial additional variation. We validated variants in challenging regions, including complex structural variants resolving into multiple haplotypes. Phenome-wide association analysis of validated SNPs uncovered variants associated with glucose/insulin levels and hippocampal gene expression. We propose an interaction between Pirl1l1, chromogranin expression, TNF-α levels, and insulin regulation. This study demonstrates the utility of linked-read pangenomes for comprehensive variant detection and mapping phenotypic diversity in a widely used rat genetic reference panel.

RevDate: 2025-03-03

Yano R, Li F, Hiraga S, et al (2025)

The genomic landscape of gene-level structural variations in Japanese and global soybean Glycine max cultivars.

Nature genetics [Epub ahead of print].

Japanese soybeans are traditionally bred to produce soy foods such as tofu, miso and boiled soybeans. Here, to investigate their distinctive genomic features, including genomic structural variations (SVs), we constructed 11 nanopore-based genome references for Japanese and other soybean lines. Our assembly-based comparative method, designated 'Asm2sv', identified gene-level SVs comprehensively, enabling pangenome analysis of 462 worldwide cultivars and varieties. Based on these, we identified selective sweeps between Japanese and US soybeans, one of which was the pod-shattering resistance gene PDH1. Genome-wide association studies further identified several quantitative trait loci that accounted for large-seed phenotypes of Japanese soybean lines, some of which were also close to regions of the selective sweeps, including PDH1. Notably, specific combinations of alleles, including SVs, were found to increase the seed size of some Japanese landraces. In addition to the differences in cultivation environments, distinct food processing usages might result in changes in Japanese soybean genomes.

RevDate: 2025-03-03

Lee RRQ, E Chae (2025)

Monkeys at Rigged Typewriters: A Population and Network View of Plant Immune System Incompatibility.

Annual review of plant biology [Epub ahead of print].

Immune system incompatibilities between naturally occurring genomic variants underlie many hybrid defects in plants and present a barrier for crop improvement. In this review, we approach immune system incompatibilities from pan-genomic and network perspectives. Pan-genomes offer insights into how natural variation shapes the evolutionary landscape of immune system incompatibilities, and through it, selection, polymorphisms, and recombination resistance emerge as common features that synergistically drive these incompatibilities. By contextualizing incompatibilities within the immune network, immune receptor promiscuity, complex dysregulation, and single-point failure appear to be recurrent themes of immune system defects. As geneticists break genes to investigate their function, so can we investigate broken immune systems to enrich our understanding of plant immune systems and work toward improving them.

RevDate: 2025-03-03

Bouzek H, Srinivasan S, Jones DS, et al (2025)

A Syntenic Pangenome for Gardnerella Reveals Taxonomic Boundaries and Stratification of Metabolic and Virulence Potential across Species.

bioRxiv : the preprint server for biology pii:2025.02.19.636902.

Bacterial vaginosis (BV) is a prevalent condition associated with an imbalance in the vaginal microbiota, often involving species of Gardnerella. The taxonomic complexity and inconsistent nomenclature of Gardnerella have impeded progress in understanding the role of specific species in health and disease. In this study, we conducted a comprehensive genomic and pangenomic analysis to resolve taxonomic ambiguities and elucidate metabolic and virulence potential across Gardnerella species. We obtained complete, closed genomes for 42 Gardnerella isolates from women with BV and curated publicly available genome sequences (n = 291). Average nucleotide identity (ANI) analysis, digital DNA-DNA hybridization (dDDH), and the cpn60 gene sequences identified nine species and eleven subspecies within Gardnerella, for which we refined species and subspecies boundaries and proposed updated nomenclature. Pangenome analysis revealed species-specific gene clusters linked to metabolic pathways, virulence factors, and niche adaptations, distinguishing species specialized for mucin degradation in the vaginal environment from those potentially adapted to urinary tract colonization. Notably, we identified lineage-specific evolutionary divergence in gene clusters associated with biofilm formation, carbohydrate metabolism, and antimicrobial resistance. We further discovered the first cryptic plasmids naturally present within the Gardnerella genus. Our findings provide a unified framework for Gardnerella taxonomy and nomenclature, and enhance our understanding of species-specific functional capabilities, with implications for Gardnerella research, diagnostics, and targeted therapeutics in BV.

RevDate: 2025-03-02

Hwang S, Brown NK, Ahmed OY, et al (2025)

MEM-based pangenome indexing for k-mer queries.

Algorithms for molecular biology : AMB, 20(1):3.

Pangenomes are growing in number and size, thanks to the prevalence of high-quality long-read assemblies. However, current methods for studying sequence composition and conservation within pangenomes have limitations. Methods based on graph pangenomes require a computationally expensive multiple-alignment step, which can leave out some variation. Indexes based on k-mers and de Bruijn graphs are limited to answering questions at a specific substring length k. We present Maximal Exact Match Ordered (MEMO), a pangenome indexing method based on maximal exact matches (MEMs) between sequences. A single MEMO index can handle arbitrary-length queries over pangenomic windows. MEMO enables both queries that test k-mer presence/absence (membership queries) and that count the number of genomes containing k-mers in a window (conservation queries). MEMO's index for a pangenome of 89 human autosomal haplotypes fits in 2.04 GB, 8.8 × smaller than a comparable KMC3 index and 11.4 × smaller than a PanKmer index. MEMO indexes can be made smaller by sacrificing some counting resolution, with our decile-resolution HPRC index reaching 0.67 GB. MEMO can conduct a conservation query for 31-mers over the human leukocyte antigen locus in 13.89 s, 2.5 × faster than other approaches. MEMO's small index size, lack of k-mer length dependence, and efficient queries make it a flexible tool for studying and visualizing substring conservation in pangenomes.
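To make the two query types concrete, the sketch below answers a conservation query (how many genomes contain each k-mer in a window) with plain Python sets; a membership query is then simply a test for a non-zero count. This is a naive baseline written for illustration, not the MEMO index, and the function name `conservation` and its parameters are assumptions.

```python
def conservation(reference, genomes, start, end, k):
    """Count, for each k-mer starting in reference[start:end], how many
    genomes contain it.

    Naive set-based approach (assumption: genomes fit in memory); it has
    to be rebuilt for every k, which is the limitation the abstract
    attributes to fixed-k indexes.
    """
    kmer_sets = [
        {g[i:i + k] for i in range(len(g) - k + 1)} for g in genomes
    ]
    counts = []
    # Stop early so that every k-mer lies fully inside the reference.
    last_start = min(end, len(reference) - k + 1)
    for i in range(start, last_start):
        kmer = reference[i:i + k]
        counts.append(sum(kmer in s for s in kmer_sets))
    return counts


if __name__ == "__main__":
    genomes = ["ACGTAC", "ACGTTT", "GGGGGG"]  # toy haplotypes
    counts = conservation("ACGTAC", genomes, start=0, end=4, k=3)
    print(counts)                   # conservation query
    print([c > 0 for c in counts])  # membership query
```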

RevDate: 2025-03-02

Sliti A, Kim RH, Lee D, et al (2025)

Whole Genome Sequencing and In Silico Analysis of the Safety and Probiotic Features of Lacticaseibacillus paracasei FMT2 Isolated from Fecal Microbiota Transplantation (FMT) Capsules.

Microbial pathogenesis pii:S0882-4010(25)00130-5 [Epub ahead of print].

Lacticaseibacillus paracasei is widely used as a probiotic supplement and food additive in the medicinal and food industries. However, its application requires careful evaluation of safety traits associated with probiotic pathogenesis, including the transfer of antibiotic-resistance genes, the presence of virulence and pathogenicity factors, and the potential disruptions of the gut microbiome and immune system. In this study, we conducted whole genome sequencing (WGS) of L. paracasei FMT2 isolated from fecal microbiota transplantation (FMT) capsules and performed genome annotation to assess its probiotic and safety attributes. Our comparative genomic analysis assessed this novel strain's genetic attributes and functional diversity and unraveled its evolutionary relationships with other L. paracasei strains. The assembly yielded three contigs: one corresponding to the chromosome and two corresponding to plasmids. Genome annotation revealed the presence of 2,838 DNA-coding sequences (CDS), 78 ribosomal RNAs (rRNAs), 60 transfer RNAs (tRNAs), three non-coding RNAs (ncRNAs), and 126 pseudogenes. The strain lacked antibiotic resistance genes and pathogenicity factors. Two intact prophages, one Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) region, and three antimicrobial peptide gene clusters were identified, highlighting the genomic stability and antimicrobial potential of the strain. Furthermore, genes linked to probiotic functions, such as mucosal colonization, stress resistance, and biofilm formation, were characterized. The pan-genome analysis identified 3,358 orthologous clusters, including 1,775 single-copy clusters, across all L. paracasei strains. Notably, L. paracasei FMT2 contained many unique singleton genes, potentially contributing to its distinctive probiotic properties. Our findings confirm the potential of L. paracasei FMT2 for food and therapeutic applications based on its probiotic profile and safety.

RevDate: 2025-03-02
CmpDate: 2025-02-28

Jaggi KE, Krak K, Štorchová H, et al (2025)

A pangenome reveals LTR repeat dynamics as a major driver of genome evolution in Chenopodium.

The plant genome, 18(1):e70010.

The genus Chenopodium L. is characterized by its wide geographic distribution and ecological adaptability. Species such as quinoa (Chenopodium quinoa Willd.) have served as domesticated staple crops for centuries. Wild Chenopodium species exhibit diverse niche adaptations and are important genetic reservoirs for beneficial agronomic traits, including disease resistance and climate hardiness. To harness the potential of the wild taxa for crop improvement, we developed a Chenopodium pangenome through the assembly and comparative analyses of 12 Chenopodium species that encompass the eight known genome types (A-H). Six of the species are new chromosome-scale assemblies, and many are polyploids; thus, a total of 20 genomes were included in the pangenome analyses. We show that the genomes vary dramatically in size with the D genome being the smallest (∼370 Mb) and the B genome being the largest (∼700 Mb) and that genome size was correlated with independent expansions of the Copia and Gypsy LTR retrotransposon families, suggesting that transposable elements have played a critical role in the evolution of the Chenopodium genomes. We annotated a total of 33,457 pan-Chenopodium gene families, of which ∼65% were classified as shell (2% private). Phylogenetic analysis clarified the evolutionary relationships among the genome lineages, notably resolving the taxonomic placement of the F genome while highlighting the uniqueness of the A genome in the Western Hemisphere. These genomic resources are particularly important for understanding the secondary and tertiary gene pools available for the improvement of the domesticated chenopods while furthering our understanding of the evolution and complexity within the genus.

RevDate: 2025-02-28

Munung NS (2025)

Science and Society: Pathways to Equitable Access and Delivery of Genomics Medicine in Africa.

Current genetic medicine reports, 13(1):1.

PURPOSE OF REVIEW: Recent advances in genetics are pushing the frontiers of health research in Africa. Notable developments include the release of the draft human pangenome reference, regulatory approval of gene editing therapies for sickle cell disease, and the announcements of major initiatives such as the Ghana Genome Project, the Personalized Medicine in North Africa Initiative, Nigeria's 100K Genome Project and South Africa's 110K Human Genomes Project. Additionally, gene-based therapies for HIV are on the horizon, with clinical trials planned in some African countries. Despite this progress, a pressing challenge remains: ensuring equitable access and delivery of genomics medicine worldwide, particularly in Africa and other low and middle income regions.

SUMMARY AND A CALL TO ACTION: Science diplomacy and academic-industry partnerships are key to achieving "Genomics for All." This requires collaboration between African governments, academic institutions, funding agencies, commercial biotechnology companies, civil society, and international health organizations. Together, these stakeholders must define and establish a sustainable framework to support genetic research in Africa, increase the availability of genetic data from African populations, and set up translational genomics medicine initiatives tailored to the continent's unique healthcare needs. Science advocacy and diplomacy are also needed to establish mechanisms that prevent the hoarding of genetic resources, including genetic data and novel interventions, and guarantee equitable access to the scientific, medical and economic benefits of genomics for all nations. Achieving this vision may necessitate international treaties to promote equitable access to genomic innovations, responsible and ethical cross-border data sharing, and long-term strategies to address funding gaps in genomic research and its application in medicine and healthcare in Africa.

RevDate: 2025-02-26

Guo L, Wang X, Ayhan DH, et al (2025)

Super pangenome of Vitis empowers identification of downy mildew resistance genes for grapevine improvement.

Nature genetics [Epub ahead of print].

Grapevine (Vitis) is one of the oldest domesticated fruit crops with great cultural and economic importance. Here we assembled and annotated haplotype-resolved genomes of 72 global Vitis accessions including 25 wild and 47 cultivated grapevines, among which genomes for 60 grapevines are newly released. Haplotype-aware phylogenomics disentangled the mysterious hybridization history of grapevines, revealing the enormous genetic diversity of the Vitis genus. Pangenomic analysis reveals that European cultivars, more susceptible to the destructive disease downy mildew (DM), have a smaller repertoire of resistance genes in the NLR family encoding the TIR-NBARC-LRR domain. Through extensive structural variation (SV) characterization, phenotyping, DM-infection transcriptome profiling of 113 Vitis accessions, and SV-expression quantitative trait loci analysis, we have identified over 63 SVs and their relevant genes significantly associated with DM resistance, exemplified by a lysine histidine transporter, VvLHT8. This haplotype-resolved super pangenome of the Vitis genus will accelerate breeding and enrich our understanding of the evolution and biology of grapevines.

RevDate: 2025-02-27

Man QC, Wang YQ, Gao SJ, et al (2024)

Pan-genome analysis and expression verification of the maize ARF gene family.

Frontiers in plant science, 15:1506853.

Auxin transcription factors regulate auxin responses and play crucial roles in plant growth, development, and responses to abiotic stress. Utilizing the maize pan-genome data, this study identified 35 ARF family members in maize, comprising 21 core genes, 10 near-core genes, and 4 non-essential genes; no private genes were detected. The construction of a phylogenetic tree using Arabidopsis thaliana revealed that the G3 subfamily comprises the highest number of core genes, with a total of 10, and exhibits relative stability throughout the evolution of maize. The calculation of the Ka/Ks ratios for ARF family members across 26 genomes indicated that, aside from ARF8 and ARF11, which were subjected to positive selection, the remaining genes underwent purifying selection. Analysis of structural variation revealed that the expression level of the ARF4 gene significantly differed as a result of this variation. The structural variation also influenced the conserved domain and cis-acting elements of the gene. Combining the transcriptome data with RT-qPCR further showed that the expression levels of ARF family members in maize were higher at the early stage of embryo and grain development, that the expression levels of each member in embryo and grain were complementary, and that ARF4 plays an important role in abiotic stress. In summary, this study utilizes the maize pan-genome and bioinformatics methods to investigate the evolutionary relationships and functional roles of ARF family members in maize, thereby providing a novel theoretical framework for further research on the maize ARF family.
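The core / near-core / non-essential / private split used in pan-genome studies like this one can be sketched as a classification by how many genomes carry a gene. The 90% near-core cut-off, the label "dispensable" for the non-essential class, and the function name `classify_gene` are assumptions for illustration; the abstract does not state the thresholds the authors used.

```python
def classify_gene(n_present, n_genomes, near_core_fraction=0.9):
    """Assign a pan-genome category from a presence count.

    Assumed rules (not taken from the cited study):
      core        - present in every genome
      private     - present in exactly one genome
      near-core   - present in at least near_core_fraction of genomes
      dispensable - everything else
    """
    if not 1 <= n_present <= n_genomes:
        raise ValueError("n_present must be between 1 and n_genomes")
    if n_present == n_genomes:
        return "core"
    if n_present == 1:
        return "private"
    if n_present >= near_core_fraction * n_genomes:
        return "near-core"
    return "dispensable"


if __name__ == "__main__":
    # Hypothetical presence counts across a 26-genome panel.
    for count in (26, 24, 10, 1):
        print(count, classify_gene(count, 26))
```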

RevDate: 2025-02-26

Wang D, Xie J, Wang J, et al (2025)

Unraveling Allelic Impacts on Pre-Harvest Sprouting Resistance in TaVP1-B of Chinese Wheat Accessions Using Pan-Genome.

Plants (Basel, Switzerland), 14(4):.

The TaVP1-B gene, located on the 3B chromosome of wheat, is a homolog of the Viviparous-1 (VP-1) gene of maize and was reported to confer resistance to pre-harvest sprouting (PHS) in wheat. In this study, the structure of the TaVP1-B gene was analyzed using the wheat pan-genome consisting of 20 released cultivars (19 of which are from China), and 3 single nucleotide polymorphisms (SNPs) were identified at positions 496 bp, 524 bp, and 1548 bp of the TaVP1-B CDS region. Haplotype analysis showed that these SNPs were in complete linkage disequilibrium and that only two haplotypes, designated hap1 (TGG) and hap2 (GAA), were present. Association analysis between TaVP1-B haplotypes and PHS resistance of the 20 wheat cultivars in four experimental environments revealed that the average PHS resistance of accessions with hap1 was significantly better than that of accessions with hap2, which implies an effect of TaVP1-B on wheat PHS resistance. To further investigate the impacts of alleles at the TaVP1-B locus on PHS resistance, the SNP at 1548 bp of the TaVP1-B CDS region was converted to a KASP marker, which was used for genotyping 304 Chinese wheat cultivars, whose PHS resistance was evaluated in three environments. The average sprouting rates (SRs) of the 135 wheat cultivars with hap1 were significantly lower than those of the 169 cultivars with hap2, validating the impacts of TaVP1-B on PHS resistance in Chinese wheat. The present study provides a breeding-friendly marker for functional variants in the TaVP1-B gene, which can be used for genetic improvement of PHS resistance in wheat.

RevDate: 2025-02-26

Ye M, Jiang Y, Han Q, et al (2025)

Probiotic Potential of Enterococcus lactis GL3 Strain Isolated from Honeybee (Apis mellifera L.) Larvae: Insights into Its Antimicrobial Activity Against Paenibacillus larvae.

Veterinary sciences, 12(2):.

This study aimed to address the need for effective probiotics and antibacterial agents to combat American foulbrood disease in honeybees, caused by Paenibacillus larvae. In the context of declining honeybee populations due to pathogens, we isolated eight lactic acid bacteria (LAB) strains from honeybee larvae (Apis mellifera L.) and evaluated their probiotic potential and inhibitory effects against P. larvae. Methods included probiotic property assessments, such as acid and bile salt resistance, hydrophobicity, auto-aggregation, co-aggregation with P. larvae, antioxidant capacities, osmotolerance to 50% sucrose, and antibiotic susceptibility. Results indicated that the GL3 strain exhibited superior probiotic attributes and potent inhibitory effects on P. larvae. Whole-genome sequencing revealed GL3 to be an Enterococcus lactis strain with genetic features tailored to the honeybee larval gut environment. Pangenome analysis highlighted genetic diversity among E. lactis strains, while molecular docking analysis identified aborycin, a lasso peptide produced by GL3, as a promising inhibitor of bacterial cell wall synthesis. These findings suggested that GL3 was a promising probiotic candidate and antibacterial agent for honeybee health management, warranting further investigation into its in vivo efficacy and potential applications in beekeeping practices.

RevDate: 2025-02-26

Fauzia KA, Rathnayake J, Doohan D, et al (2025)

Beyond Low Prevalence: Exploring Antibiotic Resistance and Virulence Profiles in Sri Lankan Helicobacter pylori with Comparative Genomics.

Microorganisms, 13(2): pii:microorganisms13020420.

Helicobacter pylori infects at least half the population worldwide, and its highly diverse genomic content correlates with its geographic distribution because of its prolonged relationship with humans. The extremely low infection prevalence alongside low inflammation severity observed in some countries might be caused by strains with low virulence potential. Therefore, this study aimed to investigate whole-genome analysis datasets of Sri Lankan H. pylori strains. H. pylori strains were isolated from biopsy specimens and underwent whole-genome sequencing to investigate their antibiotic resistance and virulence potential. The prevalence of H. pylori infection in Sri Lanka is extremely low (1.7% in a previous study), and only six H. pylori strains were successfully isolated from bacterial culture. Antibiotic resistance analysis showed a high prevalence of metronidazole resistance (83.3%, five out of six strains), and investigation of the related genes showed truncation of the rdxA and frxA genes and single-nucleotide polymorphisms in the rdxA, frxA, ribF, omp11, and fur genes. Most virulence genes of the 144 assessed were present, except for the cag pathogenicity island (cagPAI) (absent in four out of six strains), babA/B/C, and tlpB genes. An incomplete type 4 secretion system (tfs) was found in three strains. A pan-genome analysis with non-Sri Lankan H. pylori strains showed that the htpX gene was found only in Sri Lankan strains (p-corrected = 0.0008). A phylogenetic analysis showed that the Sri Lankan strains clustered with strains from hpAsia2 and hpEurope. This comparative genomic study shows that H. pylori strains with low virulence potential are present in countries with a low prevalence of infection and disease severity, indicating a strain-type geographical pattern. The tailored guidelines for screening and treatment strategy for each region are necessary to obtain effective and efficient eradication.

RevDate: 2025-02-26
CmpDate: 2025-02-26

Dudley EP, Scott MA, Kittana H, et al (2025)

The Pathogenomics of the Respiratory Mycoplasma bovis Strains Circulating in Cattle Around the Texas Panhandle, USA.

Pathogens (Basel, Switzerland), 14(2): pii:pathogens14020167.

Bovine respiratory disease (BRD) is a major economic and animal welfare issue in the beef industry. Mycoplasma bovis is one of the main causal organisms, particularly in chronic cases. Due to the difficulty of isolating M. bovis from clinical isolates, there is a lack of information on the genetic diversity of this pathogen in the Texas panhandle region of the United States. Therefore, our objective was to provide genome-level characterization of M. bovis isolated from the lung lesions of beef and dairy cattle in the Texas panhandle. Fifty-four isolates displaying mycoplasma-like growth were recovered from bovine lung lesions by the Texas Veterinary Medical Diagnostic Laboratory in 2021 and 2022. Of these isolates, 32 were determined to be M. bovis via species-specific qPCR using the uvrC gene. Long-read whole-genome sequencing was used to identify key virulence factors, antimicrobial resistance genes, and to assess the genetic diversity of these isolates. Fisher's exact tests were used to identify associations between isolate characteristics and host metadata, including the state of origin, type of operation, animal age, and animal sex. Our results indicate that there is considerable genetic diversity among the M. bovis isolates, despite their shared geography in the Texas panhandle, though significant clustering based on host metadata was observed. Analysis of the pangenome showed that the M. bovis isolates in this study also harbor a diverse array of virulence genes, but no antimicrobial resistance genes were identified in this study.
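
The abstract above mentions Fisher's exact tests for associations between isolate characteristics and host metadata. As a rough illustration only, the two-sided test on a 2x2 table can be sketched with the Python standard library. The function name and the example counts are hypothetical and are not taken from the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def weight(x):
        # Unnormalised hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x)

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    observed = weight(a)
    extreme = sum(weight(x) for x in range(lo, hi + 1) if weight(x) <= observed)
    return extreme / comb(n, col1)


# Hypothetical counts: rows = operation type (beef, dairy),
# columns = a given gene present / absent.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 3))
```

Integer weights are compared before the final division, so ties with the observed table are handled exactly rather than through a floating-point tolerance.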

RevDate: 2025-02-26

Morey-León G, Fernández-Cadena JC, Andrade-Molina D, et al (2025)

Decoding Ecuadorian Mycobacterium tuberculosis Isolates: Unveiling Lineage-Associated Signatures in Beta-Lactamase Resistance via Pangenome Analysis.

Biomedicines, 13(2): pii:biomedicines13020313.

Background: Tuberculosis is the second largest public health threat caused by pathogens. Understanding Mycobacterium tuberculosis's transmission, virulence, and resistance profile is crucial for outbreak control. This study aimed to investigate the pangenome composition of Mycobacterium tuberculosis clinical isolates classified as L4 derived from Ecuador. Methods: We analyzed 88 clinical isolates of Mycobacterium tuberculosis by whole-genome sequencing (WGS) and bioinformatic tools for Lineage, Drug-resistance and Pangenome analysis. Results: In our analysis, we identified the dominance of the LAM lineage (44.3%). The pangenomic analysis revealed a core genome of approximately 3200 genes and a pangenome that differed in accessory and unique genes. According to the COG database, metabolism-related genes were the most representative of all partitions. However, differences were found within all lineages analyzed in the metabolic pathways described by KEGG. Isolates from Ecuador showed variations in genomic regions associated with beta-lactamase susceptibility, potentially leading to epistatic resistance to other drugs commonly used in TB treatment, warranting further investigation. Conclusions: Our findings provide valuable insights into the genetic diversity of Mycobacterium tuberculosis populations in Ecuador. These insights may be associated with increasing adaptation within host heterogeneity, variable latency periods, and reduced host damage, collectively contributing to disease spread. The application of WGS is essential to elucidating the epidemiology of TB in the country.

RevDate: 2025-02-25
CmpDate: 2025-02-25

Du W, Sun Q, Hu S, et al (2025)

Equus mitochondrial pangenome reveals independent domestication imprints in donkeys and horses.

Scientific reports, 15(1):6803.

Mitochondria are semi-autonomous organelles that play a crucial role in the energy budget of animal cells and are closely related to the locomotor abilities of animals. Equidae is renowned for including two domesticated species with distinct purposes: the endurance-oriented donkey and the power-driven horse, making it an ideal system for studying the relationship between mitochondria and locomotor abilities. In this study, to cover the genetic diversity of donkeys, we sequenced and assembled six new mitochondrial genomes from China. Meanwhile, we downloaded the published mitochondrial genomes of all species within Equus and conducted a comprehensive pan-mitochondrial genome analysis. We found that the mitochondrial genomes of Equus are highly conserved, each encoding 37 genes, including 13 protein-coding genes (PCGs). Phylogenetic analysis based on mitochondrial genomes supports previous research, indicating that the extant species in Equus are divided into three main branches: horses, donkeys, and zebras. Specifically, 761 genetic variants were identified between donkeys and horses, 68 of which were non-synonymous mutations in PCGs, potentially linked to their different locomotor abilities. Structural protein modeling indicated that despite genetic differences, the overall protein structures of donkeys and horses remain similar. This study revealed the mitochondrial genome variation patterns of domesticated animals, offering novel perspectives on domestication imprints. Additionally, it provides reliable candidate molecular markers for the identification of donkeys and horses.

RevDate: 2025-02-25

Michealsamy A, S Jayapalan (2025)

Comparative Pan- and Phylo-Genomic Analysis of Ideonella and Thermobifida Strains: Dissemination of Biodegradation Potential and Genomic Divergence.

Biochemical genetics [Epub ahead of print].

Ideonella and Thermobifida were the most promising bacterial candidates for degrading plastic polymers. A comparative pan- and phylogenomic analysis of 33 Ideonella and Thermobifida strains was done to determine their plastic degradation potential, niche adaptation and speciation. Our study disclosed that more accessory genes in the strains showed phenotypic plasticity, according to the BPGA data. Pan and core genes were employed for the phylogenetic reconstruction. Pathway enrichment analyses scrutinized the functional roles of the core and adaptive-associated genes. KEGG annotation revealed that most genes were associated with the metabolism of amino acids and carbohydrates. The detailed COG analysis disclosed that approximately 40% of the pan genes performed metabolic functions. The unique gene pool consisted of genes chiefly involved in "general function prediction" and "amino acid transport and metabolism". Our in silico study revealed that these strains could assist in agronomic applications in the future since they devour nitrogen compounds and their central metabolic pathways are involved in amino acid metabolism. The rational selection of strains of Ideonella is far more effective at depolymerising plastics than Thermobifida. A greater number of unique genes, 1701 and 692, were identified for Ideonella sakaiensis 201-F6 and Thermobifida alba DSM-43795, respectively. Furthermore, we examined the singletons involved in xenobiotic catabolism. The unique singleton data were used to construct a supertree. To characterize the conserved patterns, we used SMART and MEME to identify domain and transmembrane regions in the unique protein sequences. Therefore, our study unraveled the genomic insights into the ecology-driven speciation of Ideonella and Thermobifida.

RevDate: 2025-02-26

He Y, Liu B, Ouyang X, et al (2025)

Whole-Genome Sequencing and Fine Map Analysis of Pholiota nameko.

Journal of fungi (Basel, Switzerland), 11(2):.

Pholiota nameko (T. Ito) S. Ito and S. Imai is an emerging wild mushroom species belonging to the genus Pholiota. Its unique brown-yellow appearance and significant biological activity have garnered increasing attention in recent years. However, there is a relative lack of research on the biological characteristics and genetics of P. nameko, which greatly limits the potential for an in-depth exploration of this mushroom in the research fields of molecular breeding and evolutionary biology. This study aimed to address that gap by employing Illumina and Nanopore sequencing technologies to perform whole-genome sequencing, de novo assembly, and annotation analysis of the P. nameko ZZ1 strain. Utilizing bioinformatics methods, we conducted a comprehensive analysis of the genomic characteristics of this strain and successfully identified candidate genes associated with its mating type, carbohydrate-active enzymes, virulence factors, pan-genome, and drug resistance functions. The genome of P. nameko ZZ1 is 24.58 Mb in size and comprises 33 contigs, with a contig N50 of 2.11 Mb. A phylogenetic analysis further elucidated the genetic relationship between P. nameko and other Pholiota, revealing a high degree of collinearity between P. nameko and ZZ1. In our enzyme analysis, we identified 246 enzymes in the ZZ1 genome, including 68 key carbohydrate-active enzymes (CAZymes), and predicted the presence of 11 laccases, highlighting the strain's strong potential for cellulose degradation. We conducted a pan-genomic analysis of five closely related strains of Pholiota, yielding extensive genomic information. Among these, there were 2608 core genes, accounting for 21.35% of the total genes, and 135 dispensable genes, highlighting significant genetic diversity among Pholiota and further confirming the value of pan-genomic analysis in uncovering species diversity. Notably, while we successfully identified the A-mating-type locus, composed of the homeodomain protein genes HD1 and HD2 in ZZ1, we were unable to obtain the B-mating-type locus due to technical limitations, preventing us from acquiring the pheromone receptor of the B-mating-type. We plan to supplement these data in future studies and explore the potential impact of the B-mating-type locus on the current findings. In summary, the genome data of ZZ1 presented in this study are not only valuable resources for understanding the genetic basis of this species, but also serve as a crucial foundation for subsequent genome-assisted breeding, research into cultivation technology, and the exploration of its nutritional and potential medicinal value.

RevDate: 2025-02-24

Salamzade R, LR Kalan (2025)

Context matters: assessing the impacts of genomic background and ecology on microbial biosynthetic gene cluster evolution.

mSystems [Epub ahead of print].

Encoded within many microbial genomes, biosynthetic gene clusters (BGCs) underlie the synthesis of various secondary metabolites that often mediate ecologically important functions. Several studies and bioinformatics methods developed over the past decade have advanced our understanding of both microbial pangenomes and BGC evolution. In this minireview, we first highlight challenges in broad evolutionary analysis of BGCs, including delineation of BGC boundaries and clustering of BGCs across genomes. We further summarize key findings from microbial comparative genomics studies on BGC conservation across taxa and habitats and discuss the potential fitness effects of BGCs in different settings. Afterward, recent research showing the importance of genomic context on the production of secondary metabolites and the evolution of BGCs is highlighted. These studies draw parallels to recent, broader, investigations on gene-to-gene associations within microbial pangenomes. Finally, we describe mechanisms by which microbial pangenomes and BGCs evolve, ranging from the acquisition or origination of entire BGCs to micro-evolutionary trends of individual biosynthetic genes. An outlook on how expansions in the biosynthetic capabilities of some taxa might support theories that open pangenomes are the result of adaptive evolution is also discussed. We conclude with remarks about how future work leveraging longitudinal metagenomics across diverse ecosystems is likely to significantly improve our understanding on the evolution of microbial genomes and BGCs.

RevDate: 2025-02-24

Monlong J, Chen X, Barseghyan H, et al (2025)

Long-read sequencing resolves the clinically relevant CYP21A2 locus, supporting a new clinical test for Congenital Adrenal Hyperplasia.

medRxiv : the preprint server for health sciences pii:2025.02.07.25321404.

Congenital Adrenal Hyperplasia (CAH), one of the most common inherited disorders, is caused by defects in adrenal steroidogenesis. It is potentially lethal if untreated and is associated with multiple comorbidities, including fertility issues, obesity, insulin resistance, and dyslipidemia. CAH can result from variants in multiple genes, but the most frequent cause is deletions and conversions in the segmentally duplicated RCCX module, which contains the CYP21A2 gene and a pseudogene. The molecular genetic test to identify pathogenic alleles is cumbersome, incomplete, and available from a limited number of laboratories. It requires testing parents for accurate interpretation, leading to healthcare inequity. Less severe forms are frequently misdiagnosed, and phenotype/genotype correlations are incompletely understood. We explored whether emerging technologies could be leveraged to identify all pathogenic alleles of CAH, including phasing in proband-only cases. We targeted long-read sequencing outputs that would be practical in a clinical laboratory setting. Both HiFi-based and nanopore-based whole-genome long-read sequencing datasets could be mined to accurately identify pathogenic single-nucleotide variants, full gene deletions, fusions creating non-functional hybrids between the gene and pseudogene ("30-kb deletion"), as well as count the number of RCCX modules and phase the resulting multimodular haplotypes. On the HiFi dataset of 6 samples, the PacBio Paraphase tool was able to distinguish nine different mono-, bi-, and tri-modular haplotypes, as well as the 30-kb and whole gene deletions. To do the same on the ONT-Nanopore dataset, we designed a tool, Parakit, which creates an enriched local pangenome to represent known haplotype assemblies and map ClinVar pathogenic variants and fusions onto them. With few labels in the region, optical genome mapping was not able to reliably resolve module counts or fusions, although designing a tool to mine the dataset specifically for this region may allow doing so in the future. Both sequencing techniques yielded congruent results, matching clinically identified variants, and offered additional information above the clinical test, including phasing, count of RCCX modules, and status of the other module genes, all of which may be of clinical relevance. Thus, long-read sequencing could be used to identify variants causing multiple forms of CAH in a single test.

RevDate: 2025-02-24

Edwards SV, Fang B, Khost D, et al (2025)

Comparative population pangenomes reveal unexpected complexity and fitness effects of structural variants.

bioRxiv : the preprint server for biology pii:2025.02.11.637762.

Structural variants (SVs) are widespread in vertebrate genomes, yet their evolutionary dynamics remain poorly understood. Using 45 long-read de novo genome assemblies and pangenome tools, we analyze SVs within three closely related species of North American jays (Aphelocoma, scrub-jays) displaying a 60-fold range in effective population size. We find rapid evolution of genome architecture, including ~100 Mb variation in genome size driven by dynamic satellite landscapes with unexpectedly long (> 10 kb) repeat units and widespread variation in gene content, influencing gene expression. SVs exhibit slightly deleterious dynamics modulated by variant length and population size, with strong evidence of adaptive fixation only in large populations. Our results demonstrate how population size shapes the distribution of SVs and the importance of pangenomes to characterizing genomic diversity.

RevDate: 2025-02-24

Prodanov T, Plender EG, Seebohm G, et al (2025)

Locityper: targeted genotyping of complex polymorphic genes.

bioRxiv : the preprint server for biology pii:2024.05.03.592358.

The human genome contains numerous structurally-variable polymorphic loci, including several hundred disease-associated genes, almost inaccessible for accurate variant calling. Here we present Locityper, a tool capable of genotyping such challenging genes using short and long-read whole genome sequencing. For each target, Locityper recruits and aligns reads to locus haplotypes, for instance extracted from a pangenome, and finds the likeliest haplotype pair by optimizing read alignment, insert size and read depth profiles. Locityper accurately genotypes up to 194 of 256 challenging medically relevant loci (95% haplotypes at QV33), an 8.8-fold gain compared to 22 genes achieved with standard variant calling pipelines. Furthermore, Locityper provides access to hyperpolymorphic HLA genes and other gene families, including KIR, MUC and FCGR. With its low running time of 1h10m per sample at 8 threads, Locityper is scalable to biobank-sized cohorts, enabling association studies for previously intractable disease-relevant genes.

RevDate: 2025-02-24
CmpDate: 2025-02-21

Ucuncu MY, Ucuncu MK, Karacan I, et al (2025)

Genome resequencing and comparative analysis of Streptococcus mutans in adults with high and low caries risk.

Scientific data, 12(1):313.

Streptococcus mutans is considered the main microbial etiological agent of dental caries; it has therefore been proposed as a useful predictor of caries risk as well as a target for caries prevention strategies. We aimed to compare the genomic characteristics of S. mutans strains isolated from individuals with high and low caries risk, in order to determine their genotypic features related to dental caries in adults. A total of 25 S. mutans isolates, obtained from the saliva of 13 volunteers with high dental caries activity and 12 caries-free individuals, were analysed using whole-genome sequencing techniques. A total of 2904 protein-coding gene sequences were detected as a result of the pan-genome analysis. The number of core genes detected in all genomes sequenced in the study was found to be 1563. A total of 50584 mutations were detected using the ATCC 25175 strain as a reference. This is a large genome dataset of 25 S. mutans strains, which can be used in future S. mutans genome analyses.
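
The pan-genome and core-genome counts reported above follow from simple set operations on per-genome gene content. A minimal sketch is given below, assuming genes have already been clustered into shared identifiers. The function name, strain labels, and gene identifiers are hypothetical and do not come from the study.

```python
from typing import Dict, Set, Tuple


def pan_core_accessory(
    genomes: Dict[str, Set[str]],
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (pan, core, accessory) gene sets.

    pan       = genes present in at least one genome
    core      = genes present in every genome
    accessory = pan genes that are missing from at least one genome
    """
    if not genomes:
        return set(), set(), set()
    gene_sets = list(genomes.values())
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    return pan, core, pan - core


# Toy input: three hypothetical strains with clustered gene identifiers.
toy = {
    "strain_1": {"gtfB", "gtfC", "spaP", "mubA"},
    "strain_2": {"gtfB", "gtfC", "spaP"},
    "strain_3": {"gtfB", "gtfC", "comC"},
}
pan, core, accessory = pan_core_accessory(toy)
print(len(pan), len(core), len(accessory))
```

Real pipelines first cluster orthologous genes and often relax the core definition (for example, presence in 95% or 99% of genomes) to tolerate assembly gaps; the strict intersection here is the simplest case.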

RevDate: 2025-02-22

Hurtado A, Ocejo M, Oporto B, et al (2025)

A One Health approach for the genomic characterization of antibiotic-resistant Campylobacter isolates using Nanopore whole-genome sequencing.

Frontiers in microbiology, 16:1540210.

In response to the growing threat posed by the spread of antimicrobial resistance in zoonotic Campylobacter, a One Health approach was used to examine the genomic diversity, phylogenomic relationships, and the distribution of genetic determinants of resistance (GDR) in C. jejuni and C. coli isolates from humans, animals (ruminants, swine, and chickens), and avian food products collected during a regionally (Basque Country, Spain) and temporally (mostly 2021-2022) restricted sampling. Eighty-three C. jejuni and seventy-one C. coli isolates, most exhibiting resistance to ciprofloxacin and/or erythromycin, were whole-genome sequenced using Oxford Nanopore Technologies long-fragment sequencing (ONT). Multilocus sequence typing (MLST) analysis identified a high genomic diversity among isolates. Phylogenomic analysis showed that clustering based on the core genome was aligned with MLST profiles, regardless of the sample source. In contrast, accessory genome content sometimes discriminated isolates within the same STs and occasionally differentiated isolates from different sources. The majority of the identified GDRs were present in isolates from different sources, and a good correlation was observed between GDR distribution and phenotypic susceptibility profiles (based on minimum inhibitory concentrations interpreted according to the EUCAST epidemiological cutoff values). Genotypic resistance profiles were independent of genotypes, indicating no apparent association between resistance and phylogenetic origin. This study demonstrates that ONT sequencing is a powerful tool for molecular surveillance of bacterial pathogens in the One Health framework.

RevDate: 2025-02-23
CmpDate: 2025-02-21

Hong H, Kang M, Haymowicz A, et al (2025)

Genetic characterization and in silico serotyping of 62 Salmonella enterica isolated from Korean poultry operations.

BMC genomics, 26(1):166.

BACKGROUND: The conventional method of antigen-based serotyping for Salmonella poses challenges due to the necessity of utilizing over 150 antisera. More recently, in silico Salmonella serotyping has emerged as a predictive alternative. The purpose of this study was to predict the serovars of 62 Salmonella enterica strains isolated from Korean poultry operations and their genetic characteristics using whole genome sequencing. The analysis employed diverse methods, including ribosomal and core genome multi-locus sequence typing (MLST), based on the Salmonella In Silico Typing Resource (SISTR). Pangenome analysis, clusters of orthologous groups (COG) analysis, and identification of virulence and antibiotic resistance genes were conducted.

RESULTS: Salmonella enterica subspecies enterica serovars were observed and clustered based on the pangenome and phylogenetic tree: 21 Salmonella Albany (Albany), 13 Salmonella Bareilly (Bareilly), and 28 Salmonella Mbandaka (Mbandaka). The most frequently observed sequence types for the three serovars were ST292 in Albany, ST203 in Bareilly, and ST413 in Mbandaka. 18 antibiotic resistance genes showed varying presences based on the serovars, including Albany (qacEdelta1, tet(D), CARB-3 (blaCARB-3), and dfrA1) and Bareilly (aac(6')-ly). Intriguingly, a mutated gyrA (Ser83 → Phe, serine to phenylalanine) was observed in all 21 Albany strains, whereas Bareilly and Mbandaka carried the wild-type gyrA. Among 130 virulence genes analyzed, 107 were present in all 62 Salmonella strains, with Mbandaka strains exhibiting a higher prevalence of virulence genes related to fimbrial adherence compared to those of Albany and Bareilly.

CONCLUSIONS: The study identified distinct genetic characteristics among the three Salmonella serovars using whole genome sequencing. Albany carried a unique mutation in gyrA, occurring in the quinolone resistance-determining region. Additionally, the virulence gene profile of Mbandaka differed from the other serovars, particularly in fimbrial adherence genes. These findings demonstrate the effectiveness of in silico approaches in predicting Salmonella serovars and highlight genetic differences that may inform strategies for antibiotic resistance and virulence control, such as developing rapid diagnostic tools to detect AMR determinants (e.g., tet(D) and gyrA) or targeting serovar-specific virulence factors like fimbrial adherence genes in Mbandaka to mitigate pathogenicity.

RevDate: 2025-02-20

Secomandi S, Gallo GR, Rossi R, et al (2025)

Author Correction: Pangenome graphs and their applications in biodiversity genomics.

RevDate: 2025-02-20

Rahman MM, Siddique N, Gilman MAA, et al (2025)

Genomic and In Vitro Analysis of Pediococcus pentosaceus MBBL4 Implicated Its Therapeutic Use Against Mastitis Pathogens and as a Potential Probiotic.

Probiotics and antimicrobial proteins [Epub ahead of print].

Pediococcus pentosaceus has the potential to be used as a probiotic and biologic amid rising trends of global antimicrobial resistance (AMR) and non-communicable diseases. This study analyzed the genome of P. pentosaceus MBBL4, isolated from healthy cow milk, to assess its probiotic properties and antimicrobial efficacy. The strain was subjected to whole genome sequencing (WGS), assembly, and annotation, alongside phylogenetic and comparative genomic analyses. Additionally, carbohydrate utilization, metabolic pathways, genomic safety, and probiotic potential of MBBL4 were assessed. Its in vitro antimicrobial efficacy against mastitis pathogens was also evaluated. The WGS analysis uncovered many important probiotic traits in MBBL4. Phylogenetic analysis demonstrated a close genetic link with 15 other P. pentosaceus strains, sharing more than 99% of core genes within the pan-genome matrix. MBBL4 demonstrated an extensive range of carbohydrate metabolism activity, supported by the presence of several genes encoding enzymes, including a completely elucidated lactose metabolism pathway along with 28 additional metabolic pathway modules. Notably, its genome contains regions associated with gallic acid metabolism and related genes. MBBL4 also harbored genes encoding immunity proteins like enterocin A and lactococcin, and antimicrobial compounds including penocin A, lysozymes, laccase, colicin V, and viguiepinol. Comparative analysis with other probiotic strains revealed seven novel exopolysaccharide biosynthesis proteins and one biofilm-related protein. Moreover, MBBL4 remained sensitive to 90% of the tested antibiotics and carried only a single lincosamide resistance gene (lnuA). It effectively inhibited the growth of two important bovine mastitis pathogens, Staphylococcus aureus D4C4 and Escherichia coli G1C5. These results, along with its low pathogenicity score, support the safety profile of MBBL4 and highlight its potential as a bioactive natural therapeutic for mastitis and a promising probiotic candidate.

RevDate: 2025-02-20
CmpDate: 2025-02-20

Lin YJ, Chen CH, Chang IY, et al (2025)

Genomic and transcriptomic insights into the virulence and adaptation of shock syndrome-causing Streptococcus anginosus.

Microbiology (Reading, England), 171(2):.

Streptococcus anginosus is a common isolate of the oral cavity and an opportunistic pathogen for systemic infections. Although the pyogenic infections caused by S. anginosus are similar to those caused by Streptococcus pyogenes, S. anginosus lacks most of the well-characterized virulence factors of S. pyogenes. To investigate the pathogenicity of S. anginosus, we analysed the genome of a newly identified S. anginosus strain, KH1, which was associated with toxic shock-like syndrome in an immunocompetent adolescent. The genome of KH1 contains nine genomic islands, two Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)/CRISPR-associated systems and many phage-related proteins, indicating that the genome is influenced by prophages and horizontal gene transfer. Comparative genome analysis of 355 S. anginosus strains revealed a significant difference between the sizes of the pan genome and core genome, reflecting notable strain variations. We further analysed the transcriptomes of KH1 under conditions mimicking either the oral cavity or the bloodstream. We found that in an artificial saliva medium, the expression of a putative quorum quenching system and pyruvate oxidase for H2O2 production was upregulated, which could optimize the competitiveness of S. anginosus in the oral ecosystem. Conversely, in a modified serum medium, purine and glucan biosynthesis, competence and bacteriocin production were significantly upregulated, likely facilitating the survival of KH1 in the bloodstream. These findings indicate that S. anginosus can utilize diverse mechanisms to adapt to different environmental niches and establish infection, despite its lack of toxin production.

RevDate: 2025-02-20

Hu M, Wan P, Chen C, et al (2025)

Benchmarking, detection, and genotyping of structural variants in a population of whole-genome assemblies using the SVGAP pipeline.

bioRxiv : the preprint server for biology pii:2025.02.07.637096.

Comparisons of complete genome assemblies offer a direct procedure for characterizing all genetic differences among them. However, existing tools are often limited to specific aligners or optimized for specific organisms, narrowing their applicability, particularly for large and repetitive plant genomes. Here, we introduce SVGAP, a pipeline for structural variant (SV) discovery, genotyping, and annotation from high-quality genome assemblies at the population level. Through extensive benchmarks using simulated SV datasets at individual, population, and phylogenetic contexts, we demonstrate that SVGAP performs favorably relative to existing tools in SV discovery. Additionally, SVGAP is one of the few tools to address the challenge of genotyping SVs within large assembled genome samples, and it generates fully genotyped VCF files. Applying SVGAP to 26 maize genomes revealed hidden genomic diversity in centromeres, driven by abundant insertions of centromere-specific LTR-retrotransposons. The output of SVGAP is well-suited for pan-genome construction and facilitates the interpretation of previously unexplored genomic regions.

RevDate: 2025-02-20

Khan J, Dhulipala L, Patro R (2025)

Fast and Scalable Parallel External-Memory Construction of Colored Compacted de Bruijn Graphs with Cuttlefish 3.

bioRxiv : the preprint server for biology pii:2025.02.02.636161.

The rapid growth of genomic data over the past decade has made scalable and efficient sequence analysis algorithms, particularly for constructing de Bruijn graphs and their colored and compacted variants, critical components of many bioinformatics pipelines. Colored compacted de Bruijn graphs condense repetitive sequence information, significantly reducing the data burden on downstream analyses like assembly, indexing, and pan-genomics. However, direct construction of these graphs is necessary as constructing the original uncompacted graph is essentially infeasible at large scale. In this paper, we introduce Cuttlefish 3, a state-of-the-art parallel, external-memory algorithm for constructing (colored) compacted de Bruijn graphs. Cuttlefish 3 introduces novel algorithmic improvements that provide its scalability and speed, including optimizations to significantly speed up local contractions within subgraphs, a parallel algorithm to join local solutions based on parallel list-ranking, and a sparsification method to vastly reduce the amount of data required to compute the colored graph. Leveraging these algorithmic strategies along with algorithm engineering optimizations in parallel and external-memory settings, Cuttlefish 3 demonstrates state-of-the-art performance, surpassing existing approaches in speed and scalability across various genomic datasets in both colored and uncolored scenarios.

RevDate: 2025-02-20

Dishuck PC, Munson KM, Lewis AP, et al (2025)

Structural variation, selection, and diversification of the NPIP gene family from the human pangenome.

bioRxiv : the preprint server for biology pii:2025.02.04.636496.

The NPIP (nuclear pore interacting protein) gene family has expanded to high copy number in humans and African apes where it has been subject to an excess of amino acid replacement consistent with positive selection (1). Due to the limitations of short-read sequencing, NPIP human genetic diversity has been poorly understood. Using highly accurate assemblies generated from long-read sequencing as part of the human pangenome, we completely characterize 169 human haplotypes (4,665 NPIP paralogs and alleles). Of the 28 NPIP paralogs, just three (NPIPB2, B11, and B14) are fixed at a single copy, and only a single locus, B2, shows no structural variation. Four NPIP paralogs map to large segmental duplication blocks that mediate polymorphic inversions (355 kbp-1.6 Mbp) corresponding to microdeletions associated with developmental delay and autism. Haplotype-based tests of positive selection and selective sweeps identify two paralogs, B9 and B15, within the top percentile for both tests. Using full-length cDNA data from 101 tissue/cell types, we construct paralog-specific gene models and show that 56% (31/55 most abundant isoforms) have not been previously described in RefSeq. We define six distinct translation start sites and other protein structural features that distinguish paralogs, including a variable number tandem repeat that encodes a beta helix of variable size that emerged ∼3.1 million years ago in human evolution. Among the 28 NPIP paralogs, we identify distinct tissue and developmental patterns of expression with only a few maintaining the ancestral testis-enriched expression. A subset of paralogs (NPIPA1, A5, A6-9, B3-5, and B12/B13) show increased brain expression. Our results suggest ongoing positive selection in the human population and rapid diversification of NPIP gene models.

RevDate: 2025-02-20

Sanaullah A, Villalobos S, Zhi D, et al (2025)

Haplotype Matching with GBWT for Pangenome Graphs.

bioRxiv : the preprint server for biology pii:2025.02.03.634410.

Traditionally, variations from a linear reference genome were used to represent large sets of haplotypes compactly. In the linear reference genome based paradigm, the positional Burrows-Wheeler transform (PBWT) has traditionally been used to perform efficient haplotype matching. Pangenome graphs have recently been proposed as an alternative to linear reference genomes for representing the full spectrum of variations in the human genome. However, haplotype matches in pangenome graph based haplotype sets are not trivially generalizable from haplotype matches in the linear reference genome based haplotype sets. Work has been done to represent large sets of haplotypes as paths through a pangenome graph. The graph Burrows-Wheeler transform (GBWT) is one such work. The GBWT essentially stores the haplotype paths in a run length compressed BWT with compressed local alphabets. Although efficient in practice count and locate queries on the GBWT were provided by the original authors, the efficient haplotype matching capabilities of the PBWT have never been shown on the GBWT. In this paper, we formally define the notion of haplotype matches in pangenome graph-based haplotype sets by generalizing from haplotype matches in linear reference genome-based haplotype sets. We also describe the relationship between set maximal matches, long matches, locally maximal matches, and text maximal matches on the GBWT, PBWT, and the BWT. We provide algorithms for outputting some of these matches by applying the data structures of the r-index (introduced by Gagie et al.) to the GBWT. We show that these structures enable set maximal match and long match queries on the GBWT in almost linear time and in space close to linear in the number of runs in the GBWT. We also provide multiple versions of the query algorithms for different combinations of the available data structures. The long match query algorithms presented here even run on the BWT in the same time complexity as the GBWT due to their similarity.
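
To make the phrase "run length compressed BWT" in the abstract above concrete, here is a minimal Python sketch. It is not the GBWT, which works on node sequences of haplotype paths with per-node local alphabets; it assumes a plain string, builds the Burrows-Wheeler transform naively from sorted rotations, and then run-length encodes the result. The function names `bwt` and `run_length_encode` are illustrative and do not come from any library.

```python
from itertools import groupby
from typing import List, Tuple


def bwt(text: str, sentinel: str = "$") -> str:
    """Naive Burrows-Wheeler transform via sorted rotations.

    Assumes the sentinel does not occur in the text and sorts before
    every other character (true for "$" versus ASCII letters).
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the text")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The BWT is the last column of the sorted rotation matrix.
    return "".join(rotation[-1] for rotation in rotations)


def run_length_encode(s: str) -> List[Tuple[str, int]]:
    """Collapse each maximal run of equal characters into (char, length)."""
    return [(ch, sum(1 for _ in group)) for ch, group in groupby(s)]


transformed = bwt("banana")
runs = run_length_encode(transformed)
print(transformed)  # annb$aa
print(runs)         # [('a', 1), ('n', 2), ('b', 1), ('$', 1), ('a', 2)]
```

The length of `runs` corresponds to the "number of runs" in which the abstract states its space bound; repetitive inputs such as collections of similar haplotypes tend to produce few runs relative to their length.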

RevDate: 2025-02-20

Navasca A, Singh J, Rivera-Varas V, et al (2025)

Dispensable genome and segmental duplications drive the genome plasticity in Fusarium solani.

Frontiers in fungal biology, 6:1432339.

Fusarium solani is a species complex encompassing a large phylogenetic clade with diverse members occupying varied habitats. We recently reported a unique opportunistic F. solani associated with unusual dark galls in sugarbeet. We assembled the chromosome-level genome of the F. solani sugarbeet isolate strain SB1 using Oxford Nanopore and Hi-C sequencing. The average size of F. solani genomes is 54 Mb, whereas SB1 has a larger genome of 59.38 Mb, organized into 15 chromosomes. The genome expansion of strain SB1 is due to the high repeats and segmental duplications within its three potentially accessory chromosomes. These chromosomes are absent in the closest reference genome with chromosome-level assembly, F. vanettenii 77-13-4. Segmental duplications were found in three chromosomes but are most extensive between two specific SB1 chromosomes, suggesting that this isolate may have doubled its accessory genes. Further comparison of the F. solani strain SB1 genome demonstrates inversions and syntenic regions to an accessory chromosome of F. vanettenii 77-13-4. The pan-genome of 12 publicly available F. solani isolates nearly reached gene saturation, with few new genes discovered after the addition of the last genome. Based on orthogroups and average nucleotide identity, F. solani is not grouped by lifestyle or origin. The pan-genome analysis further revealed the enrichment of several enzyme-coding genes within the dispensable (accessory + unique genes) genome, such as hydrolases, transferases, oxidoreductases, lyases, ligases, isomerases, and dehydrogenases. The evidence presented here suggests that genome plasticity, genetic diversity, and adaptive traits in Fusarium solani are driven by the dispensable genome with significant contributions from segmental duplications.

RevDate: 2025-02-20

Magome TG, Surleac M, Hassim A, et al (2025)

Decoding the anomalies: a genome-based analysis of Bacillus cereus group strains closely related to Bacillus anthracis.

Frontiers in microbiology, 16:1527049.

INTRODUCTION: The Bacillus cereus group encompasses a complex group of closely related pathogenic and non-pathogenic bacterial species. Key members include B. anthracis, B. cereus, and B. thuringiensis, organisms that, despite genetic proximity, diverge significantly in morphology and pathogenic potential. Taxonomic challenges persist due to inconsistent classification methods, particularly for B. cereus isolates that resemble B. anthracis in genetic clustering.

METHODS: This study investigated B. cereus group isolates from blood smears of animal carcasses in Kruger National Park, uncovering an unusual isolate with B. cereus features based on classical microbiological tests yet B. anthracis-like genomic similarities with an Average Nucleotide Identity (ANI) of ≥95%. Using comparative genomics, pan-genomics and whole genome Single Nucleotide Polymorphism (wgSNP) analysis, a total of 103 B. cereus group genomes were analyzed, including nine newly sequenced isolates from South Africa and a collection of isolates that showed some classification discrepancies, thus classified as "anomalous."

RESULTS AND DISCUSSION: Of the 36 strains identified as B. anthracis in GenBank, 26 clustered phylogenetically with the four confirmed B. anthracis isolates from South Africa and shared 99% ANI. Isolates with less than 99% ANI alignment to B. anthracis exhibited characteristics consistent with B. cereus and/or B. thuringiensis, possessing diverse genetic profiles, insertion elements, resistance genes, and virulence gene features, contrasting with the genetic uniformity of typical B. anthracis. The findings underscore a recurrent acquisition of mobile genetic elements within B. cereus and B. thuringiensis, a process infrequent in B. anthracis.

CONCLUSION: This study highlights the pressing need for standardized taxonomic criteria in B. cereus group classification, especially as anomalous isolates emerge. This study supports the existing nomenclature framework which offers an effective solution for classifying species into genomospecies groups. We recommend isolates with ANI ≥99% to standard reference B. anthracis be designated as typical B. anthracis in GenBank to maintain taxonomic clarity and precision.

RevDate: 2025-02-19

Mallappa A, Kuralayanapalya Puttahonnappa S, Shome R, et al (2025)

Systematic review, Meta-analysis, and Pan-genome analytics predict the surging of Brucella melitensis by China and India-specific strains, elucidating the demand for enhanced preparedness.

Journal of infection and public health, 18(4):102693 pii:S1876-0341(25)00042-5 [Epub ahead of print].

BACKGROUND: Brucellosis is an infectious disease in lower to moderate-income countries. It primarily affects small ruminant (sheep and goat) populations and can also be transmitted to mammals (humans). Brucella melitensis (B. melitensis) is the primary cause, posing a zoonotic threat. Controlling the spread of B. melitensis, especially in regions with high risk to humans and small ruminants, remains challenging. Current research explores the prevalence, genetic diversity, and prediction of brucellosis transmission in ruminants and humans.

METHODS: In this study, we developed an integrated database providing information on B. melitensis incidence in livestock from 2003 to 2024 and conducted a systematic review and meta-analysis to assess the prevalence, following the Cochrane Collaboration's Preferred Reporting Items for Systematic Reviews and Meta-analysis guidelines. A comprehensive literature search was conducted using electronic databases such as PubMed, ScienceDirect, Scopus, Biomed Central, CeRA, Krishikosh, ProQuest Dissertations & Theses Global, and Web of Science, complemented by the Google Scholar search engine. We also utilized Zotero 5.0 and Rayyan QCR, two web-based tools. A time series model was used to predict incidence trends, and pan-genomic analysis was used to determine genetic diversity across Asia and Africa.

RESULTS: Meta-analysis revealed an overall prevalence of 12%. Prevalence on the African continent was 7% (95% CI: 5-8%, I[2] = 99%, τ[2] = 0.03, P = 0), while the corresponding prevalence on the Asian continent was 12% (95% CI: 11-14%, I[2] = 99%, τ[2] = 0.02, P = 0). The time series model predicts a rising trend in brucellosis incidence from 2023 to 2030. The pan-genome analysis found that the Rev 1 strain from China (0.000712) and the CIIMS-PH-3 strain from India (0.000209) showed the greatest branch lengths and are therefore considered to have more genetic diversity.

CONCLUSION: These findings underscore the critical need for ongoing surveillance models and research to monitor the evolving B. melitensis landscape. High-prevalence regions exhibit significant genetic diversity. Effective prevention & control and response & preparedness strategies, including precise detection through advanced diagnostics, robust surveillance models to track trends, and targeted vaccination of susceptible animals, are vital. Stringent quarantine protocols, biosecurity measures, and exploring herbal remedies as a complementary approach to conventional treatment are crucial to mitigate the brucellosis burden as a public health concern and its socioeconomic impact on livelihood.

RevDate: 2025-02-19
CmpDate: 2025-02-19

Payne CJ, Phuong VH, Phuoc NN, et al (2025)

Genomic diversity and evolutionary patterns of Edwardsiella ictaluri affecting farmed striped catfish (Pangasianodon hypophthalmus) in Vietnam over 20 years.

Microbial genomics, 11(2):.

Edwardsiella ictaluri continues to pose a significant risk to the health and production of striped catfish (Pangasianodon hypophthalmus) in Vietnam. Whilst recent advances in genomic sequencing provide an insight into the global genomic diversity of this important fish pathogen, genome-wide analysis of Vietnamese isolates recovered over time is lacking. In this study, we used a whole-genome sequencing approach to compare the genomes of 31 E. ictaluri isolates recovered over a 20-year period (2001-2021) and performed comparative genomic analysis to explore temporal changes in genome diversity, population structure and mechanisms driving pathogenesis and antimicrobial resistance. Our findings revealed an open pan-genome with 4148 genes and a core genome (3060 genes) accounting for over two-thirds of the genome. Moreover, we found the genomes sequenced to classify into two distinct lineages and estimated the ancestral origin of these lineages within Vietnam to date back to the 1950s. Plasmids were highly prevalent in Vietnamese E. ictaluri, with isolates harbouring up to four plasmids within their genome. Further, a diverse mobilome was observed with nine different plasmid types detected across the genome collection. Exploration of putative plasmids revealed a diverse set of antimicrobial resistance genes (ARGs) against key antibiotics used in Vietnamese aquaculture and virulence genes associated with protein secretion systems. Correlation analysis revealed the total number of ARGs detected in genomes to increase with isolate recovery time. Whilst the number of virulence genes remained relatively stable, temporal variation was noted in several virulence factors related to motility and immune system modulation. Findings from this study highlight the need for continued genomic surveillance to monitor changes in antimicrobial resistance and pathogenesis, to help inform the development of disease control and management strategies.
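
As a rough illustration of the core and pan-genome terms used in the abstract above, the following Python sketch partitions gene families from a presence/absence table and checks the quoted ratio. The isolate names, gene labels, and the `partition_pangenome` helper are hypothetical and are not taken from the study; only the counts 4148 and 3060 come from the abstract.

```python
from typing import Dict, Set, Tuple


def partition_pangenome(
    presence: Dict[str, Set[str]],
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Split gene families into pan, core, and accessory sets.

    `presence` maps each genome name to the set of gene families it carries.
    """
    genomes = list(presence.values())
    pan = set().union(*genomes)                              # found in any genome
    core = set.intersection(*genomes) if genomes else set()  # found in every genome
    accessory = pan - core                                   # found in some, not all
    return pan, core, accessory


# Hypothetical toy table with three isolates.
toy = {
    "isolate_A": {"g1", "g2", "g3"},
    "isolate_B": {"g1", "g2", "g4"},
    "isolate_C": {"g1", "g2", "g3", "g5"},
}
pan, core, accessory = partition_pangenome(toy)
print(sorted(pan), sorted(core), sorted(accessory))

# Counts quoted in the abstract: 3060 core genes out of 4148 pan-genome genes.
core_fraction = 3060 / 4148
print(f"{core_fraction:.1%}")  # 73.8%
```

The computed fraction of about 73.8% agrees with the abstract's "over two-thirds", on the assumption that the denominator is the 4148-gene pan-genome.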

RevDate: 2025-02-19

Adeniji AA, Chukwuneme CF, Conceição EC, et al (2025)

Unveiling novel features and phylogenomic assessment of indigenous Priestia megaterium AB-S79 using comparative genomics.

Microbiology spectrum [Epub ahead of print].

UNLABELLED: Priestia megaterium strain AB-S79 isolated from active gold mine soil previously expressed in vitro heavy metal resistance and has a 5.7 Mb genome useful for biotechnological exploitation. This study used web-based bioinformatic resources to analyze P. megaterium AB-S79 genomic relatedness, decipher its secondary metabolite biosynthetic gene clusters (BGCs), and better comprehend its taxa. Genes were highly conserved across the 14 P. megaterium genomes examined here. The pangenome reflected a total of 61,397 protein-coding genes, 59,745 homolog protein family hits, and 1,652 singleton protein family hits. There were also 7,735 protein families, including 1,653 singleton families and 6,082 homolog families. OrthoVenn3 comparison of AB-S79 protein sequences with 13 other P. megaterium strains, 7 other Priestia spp., and 6 other Bacillus spp. highlighted AB-S79's unique genomic and evolutionary traits. antiSMASH identified two key transcription factor binding site regulators in AB-S79's genome: zinc-responsive repressor (Zur) and antibiotic production activator (AbrC3), plus putative enzymes for the biosynthesis of terpenes and ranthipeptides. AB-S79 also harbors BGCs for two unique siderophores (synechobactins and schizokinens), phosphonate, dienelactone hydrolase family protein, and phenazine biosynthesis protein (phzF), which is significant for this study. Phosphonate particularly showed specificity for the P. megaterium sp., validating the effect of gene family expansion and contraction. P. megaterium AB-S79 looks to be a viable source for value-added compounds. Thus, this study contributes to the theoretical framework for the systematic metabolic and genetic exploitation of the P. megaterium sp., particularly the value-yielding strains.

IMPORTANCE: This study explores microbial natural product discovery using genome mining, focusing on Priestia megaterium. Key findings highlight the potential of P. megaterium, particularly strain AB-S79, for biotechnological applications. The research shows a limited output of P. megaterium genome sequences from Africa, emphasizing the importance of the native strain AB-S79. Additionally, the study underlines the strain's diverse metabolic capabilities, reinforcing its suitability as a model for microbial cell factories and its foundational role in future biotechnological exploitation.

