TY - CHAP
T1 - Differential Expression, Functional and Machine Learning Analysis of High-Throughput –Omics Data Using Open-Source Tools
AU - Kebschull, Moritz
AU - Kroeger, Annika Therese
AU - Papapanou, Panos N.
N1 - Publisher Copyright:
© 2023, The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature.
PY - 2023
Y1 - 2023
N2 - Today, –omics analyses, including the systematic cataloging of messenger RNA and microRNA sequences or DNA methylation patterns in a cell population, organ or tissue sample, allow for an unbiased, comprehensive genome-level analysis of complex diseases, offering a large advantage over earlier “candidate” gene or pathway analyses. A primary goal in the analysis of these high-throughput assays is the detection of those features among several thousand that differ between different groups of samples. In the context of oral biology, our group has successfully utilized –omics technology to identify key molecules and pathways in different diagnostic entities of periodontal disease. A major issue when inferring biological information from high-throughput –omics studies is the fact that the sheer volume of high-dimensional data generated by contemporary technology is not appropriately analyzed using common statistical methods employed in the biomedical sciences. Furthermore, machine learning methods facilitate the detection of additional patterns, beyond the mere identification of lists of features that differ between groups. Herein, we outline a robust and well-accepted bioinformatics workflow for the initial analysis of –omics data using open-source tools. We outline a differential expression analysis pipeline that can be used for data from both arrays and sequencing experiments, and offers the possibility to account for random or fixed effects. Furthermore, we present an overview of the possibilities for a functional analysis of the obtained data including subsequent machine learning approaches in the form of (i) supervised classification algorithms in class validation and (ii) unsupervised clustering in class discovery.
AB - Today, –omics analyses, including the systematic cataloging of messenger RNA and microRNA sequences or DNA methylation patterns in a cell population, organ or tissue sample, allow for an unbiased, comprehensive genome-level analysis of complex diseases, offering a large advantage over earlier “candidate” gene or pathway analyses. A primary goal in the analysis of these high-throughput assays is the detection of those features among several thousand that differ between different groups of samples. In the context of oral biology, our group has successfully utilized –omics technology to identify key molecules and pathways in different diagnostic entities of periodontal disease. A major issue when inferring biological information from high-throughput –omics studies is the fact that the sheer volume of high-dimensional data generated by contemporary technology is not appropriately analyzed using common statistical methods employed in the biomedical sciences. Furthermore, machine learning methods facilitate the detection of additional patterns, beyond the mere identification of lists of features that differ between groups. Herein, we outline a robust and well-accepted bioinformatics workflow for the initial analysis of –omics data using open-source tools. We outline a differential expression analysis pipeline that can be used for data from both arrays and sequencing experiments, and offers the possibility to account for random or fixed effects. Furthermore, we present an overview of the possibilities for a functional analysis of the obtained data including subsequent machine learning approaches in the form of (i) supervised classification algorithms in class validation and (ii) unsupervised clustering in class discovery.
UR - http://www.scopus.com/inward/record.url?scp=85142720162&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85142720162&partnerID=8YFLogxK
U2 - 10.1007/978-1-0716-2780-8_19
DO - 10.1007/978-1-0716-2780-8_19
M3 - Chapter
C2 - 36418696
AN - SCOPUS:85142720162
T3 - Methods in Molecular Biology
SP - 317
EP - 351
BT - Methods in Molecular Biology
PB - Humana Press Inc.
ER -