Abstract
Background
The analysis of orthologous genes derived from metagenomic experiments provides a valuable opportunity to assess the functional role of microbial communities in, for example, antimicrobial resistance or biochemical pathways under different experimental conditions. However, existing software does not integrate the statistical analysis of these complex data sets with the enrichment analysis of the resulting significantly differentially abundant orthologs. Genomica is an R package that, with minimal input from the user, performs a two-step analysis of functional orthologs from the KEGG Orthology. The pipeline combines false discovery rate corrected linear mixed models with functional enrichment analysis by integrating established R packages (i.e., lme4 and MicrobiomeProfiler).
Results
Genomica requires only two data frames as input, containing the data and the metadata, respectively. The pipeline integrated within the function Genomica analyzes 4000 orthologs in approximately 3 min. The outputs are collected in a single directory containing publication-ready results from the linear mixed model and from the enrichment analysis. The Benjamini-Hochberg correction is applied to the results of the linear mixed model, so that only comparisons that remain significant after P value adjustment are carried forward into the enrichment analysis.
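The filtering step described in the Results can be illustrated with a minimal sketch of the Benjamini-Hochberg adjustment. This is not Genomica's code, which is an R package: the function names `benjamini_hochberg` and `significant_orthologs`, the KEGG Orthology identifiers, the P values, and the 0.05 threshold with a strict inequality are all assumptions made for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity
    # with a running minimum of p * m / rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_orthologs(p_by_ortholog, alpha=0.05):
    """Keep orthologs whose adjusted P value is below alpha (assumed cutoff)."""
    ids = list(p_by_ortholog)
    adjusted = benjamini_hochberg([p_by_ortholog[k] for k in ids])
    return {k: q for k, q in zip(ids, adjusted) if q < alpha}


# Hypothetical raw P values, one per ortholog, as a mixed model might return.
raw = {"K00001": 0.01, "K00002": 0.04, "K00003": 0.03, "K00004": 0.20}
kept = significant_orthologs(raw)
print(sorted(kept))
```

In this toy input, three orthologs have raw P values below 0.05, but only one remains below the threshold after adjustment, which is why the enrichment step receives a smaller set than an uncorrected analysis would pass on.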
Conclusions
Genomica is a simple-to-use R package for analyzing complex datasets. It integrates a well-founded statistical analysis, which accounts for type I error inflation under repeated testing, with the enrichment analysis of orthologs that are significantly differentially abundant across experimental conditions, all with minimal input from the user.
| Original language | English |
|---|---|
| Journal | BMC Bioinformatics |
| Early online date | 18 Apr 2026 |
| DOIs | |
| Publication status | First published - 18 Apr 2026 |
Keywords
- Metagenomics
- Bioinformatics
- Microbiology
- Microbiota
- Ortholog
- Statistical analysis
- Enrichment
- Differential analysis
- Antimicrobial resistance (AMR)
Fingerprint
Dive into the research topics of 'Genomica: linear mixed model based, multiple hypothesis testing corrected, ortholog functional enrichment analysis'. Together they form a unique fingerprint.
Projects
- 1 Active
RESAS 22-27: SRUC-b6-2 A Systems Understanding Of The Flow Of AMR From Livestock Production To The Environment And Humans: Informing Risk Analyses
Hutchings, M. (PI), Baker, E. (CoI), Galgano, S. (CoI), Pollock, J. (CoI), Holden, N. (CoI), Smith, L. (CoI) & Lowe, A. (CoI)
Scottish Government: Rural & Environment Science & Analytical Services
1/04/22 → 31/03/27
Project: Research