Joint semiparametric kernel network regression

© 2023 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.

Variable selection and graphical modeling play essential roles in the analysis of highly correlated and high-dimensional (HCHD) data. Variable selection methods have been developed under both parametric and nonparametric model settings. However, variable selection for nonadditive, nonparametric regression with high-dimensional variables is challenging because the unknown dependence structures among HCHD variables are difficult to model. Gaussian graphical models are a popular and useful tool for investigating conditional dependence between variables by estimating sparse precision matrices. For a given class of interest, the estimated precision matrices can be mapped onto networks for visualization. A limitation of Gaussian graphical models, however, is that they apply only to discretized response variables and to the case $p \log(p) \ll n$, where $p$ is the number of variables and $n$ is the sample size. A joint method for variable selection and graphical modeling is therefore needed. To the best of our knowledge, methods for simultaneously selecting variables and estimating networks among variables in semiparametric regression settings are quite limited. Hence, in this paper, we develop a joint semiparametric kernel network regression method to address this limitation and to connect variable selection with network estimation. Our approach is a unified and integrated method that can simultaneously identify important variables and build a network among those variables. We develop our approach under a semiparametric kernel machine regression framework, which allows for nonlinear or nonadditive associations and complicated interactions among the variables.
The advantages of our approach are that it can (1) simultaneously select variables and build a network among HCHD variables under a regression setting; (2) model unknown and complicated interactions among the variables and estimate the network among these variables; (3) allow for any form of semiparametric model, including nonadditive, nonparametric models; and (4) provide an interpretable network that considers both the important variables and a response variable. We demonstrate our approach using a simulation study and a real application to genetic pathway-based analysis.
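
The abstract notes that an estimated precision matrix can be mapped onto a network. As a minimal illustration of that general idea only (not the authors' joint kernel method), the sketch below converts a precision matrix into partial correlations using the standard identity rho_ij = -omega_ij / sqrt(omega_ii * omega_jj) and keeps an edge wherever the partial correlation is nonzero. The function name `precision_to_network`, the `threshold` parameter, and the toy matrix are assumptions made for illustration; they do not come from the article.

```python
import math


def precision_to_network(precision, threshold=1e-8):
    """Map a precision matrix to partial correlations and an edge list.

    Hypothetical helper for illustration: `precision` is a symmetric
    positive-definite matrix given as a list of lists, and `threshold`
    is an assumed cutoff below which a partial correlation is treated
    as zero (no edge).
    """
    p = len(precision)
    partial_corr = [[1.0] * p for _ in range(p)]
    edges = []
    for i in range(p):
        for j in range(i + 1, p):
            # Standard identity: rho_ij = -omega_ij / sqrt(omega_ii * omega_jj)
            rho = -precision[i][j] / math.sqrt(precision[i][i] * precision[j][j])
            partial_corr[i][j] = partial_corr[j][i] = rho
            if abs(rho) > threshold:
                edges.append((i, j))
    return partial_corr, edges


# Toy precision matrix (assumed data): variables 0 and 2 are
# conditionally independent given variable 1, so the network is a chain.
omega = [
    [2.0, -1.0, 0.0],
    [-1.0, 2.0, -1.0],
    [0.0, -1.0, 2.0],
]
partial_corr, edges = precision_to_network(omega)
print(edges)
```

A zero entry in the precision matrix corresponds to a missing edge, which is why sparse precision estimates yield sparse, readable networks.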

Media type:

E-article

Year of publication:

2023

Published:

2023

Contained in:

To the complete record - volume:42

Contained in:

Statistics in medicine - 42(2023), 28, 10 Dec., pages 5247-5265

Language:

English

Contributors:

Kim, Byung-Jun [Author]
Kim, Inyoung [Author]

Links:

Full text

Subjects:

Graphical model
Journal Article
Least square kernel machine
Semiparametric model

Notes:

Date Completed 14.02.2024

Date Revised 14.02.2024

published: Print-Electronic

Citation Status MEDLINE

doi:

10.1002/sim.9910

funding:

Funding institution / project title:

PPN (catalog ID):

NLM362214492