Deep advantage learning for optimal dynamic treatment regime
Recently, deep learning has achieved state-of-the-art performance on many difficult tasks. Deep neural networks outperform many popular existing methods in the field of reinforcement learning, and they can also identify important covariates automatically. The parameter sharing of convolutional neural networks (CNNs) greatly reduces the number of parameters in the network, which allows for high scalability. However, little research has been done on deep advantage learning (A-learning). In this paper, we present a deep A-learning approach to estimating the optimal dynamic treatment regime. A-learning models the advantage function, which is directly relevant to the goal. We use an inverse probability weighting (IPW) method to estimate the difference between potential outcomes, which does not require any model assumption on the baseline mean function. We implemented different architectures of deep CNNs and convexified convolutional neural networks (CCNNs). The proposed deep A-learning methods are applied to data from the STAR*D trial and are shown to perform better than the penalized least squares estimator with a linear decision rule.
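The IPW idea mentioned in the abstract can be sketched in a few lines: each treatment arm's outcomes are reweighted by the inverse of the treatment-assignment probability, so the contrast between potential outcomes is estimated without modeling the baseline mean function. This is a minimal illustration on simulated data with a known propensity score; the variable names and the simulation are hypothetical, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical simulated data: one covariate, confounded treatment.
x = rng.normal(size=n)
propensity = 1.0 / (1.0 + np.exp(-x))      # P(A=1 | X), known here
a = rng.binomial(1, propensity)            # observed treatment
y = 2.0 * a + x + rng.normal(size=n)       # true effect E[Y(1) - Y(0)] = 2

# IPW contrast: reweight each arm by the inverse of its assignment
# probability; no model for the baseline mean E[Y | X] is needed.
ipw_effect = (np.mean(a * y / propensity)
              - np.mean((1 - a) * y / (1 - propensity)))
print(ipw_effect)
```

With the propensity correctly specified, the estimate concentrates around the true effect of 2; in the paper's setting the same contrast feeds the advantage-function estimation rather than a single average effect.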
Media type: E-Article
Year of publication: 2018
Published: 2018
Contained in: Complete record - volume:2
Contained in: Statistical theory and related fields - 2(2018), issue 1, pages 80-88
Language: English
Contributors: Liang, Shuhan [author]
Topics: Advantage Learning
Notes: Date Revised 03.04.2024; published: Print-Electronic; Citation Status PubMed-not-MEDLINE
DOI: 10.1080/24754269.2018.1466096
PPN (catalog ID): NLM290535425
LEADER 01000caa a22002652 4500
001 NLM290535425
003 DE-627
005 20240403232010.0
007 cr uuu---uuuuu
008 231225s2018 xx |||||o 00| ||eng c
024 7 |a 10.1080/24754269.2018.1466096 |2 doi
028 5 2 |a pubmed24n1362.xml
035 |a (DE-627)NLM290535425
035 |a (NLM)30420972
040 |a DE-627 |b ger |c DE-627 |e rakwb
041 |a eng
100 1 |a Liang, Shuhan |e verfasserin |4 aut
245 1 0 |a Deep advantage learning for optimal dynamic treatment regime
264 1 |c 2018
336 |a Text |b txt |2 rdacontent
337 |a Computermedien |b c |2 rdamedia
338 |a Online-Ressource |b cr |2 rdacarrier
500 |a Date Revised 03.04.2024
500 |a published: Print-Electronic
500 |a Citation Status PubMed-not-MEDLINE
520 |a Recently, deep learning has achieved state-of-the-art performance on many difficult tasks. Deep neural networks outperform many popular existing methods in the field of reinforcement learning, and they can also identify important covariates automatically. The parameter sharing of convolutional neural networks (CNNs) greatly reduces the number of parameters in the network, which allows for high scalability. However, little research has been done on deep advantage learning (A-learning). In this paper, we present a deep A-learning approach to estimating the optimal dynamic treatment regime. A-learning models the advantage function, which is directly relevant to the goal. We use an inverse probability weighting (IPW) method to estimate the difference between potential outcomes, which does not require any model assumption on the baseline mean function. We implemented different architectures of deep CNNs and convexified convolutional neural networks (CCNNs). The proposed deep A-learning methods are applied to data from the STAR*D trial and are shown to perform better than the penalized least squares estimator with a linear decision rule
650 4 |a Journal Article
650 4 |a Advantage Learning
650 4 |a Convexified Convolutional Neural Networks
650 4 |a Convolutional Neural Networks
650 4 |a Dynamic Treatment Regime
650 4 |a Inverse Probability Weighting
700 1 |a Lu, Wenbin |e verfasserin |4 aut
700 1 |a Song, Rui |e verfasserin |4 aut
773 0 8 |i Enthalten in |t Statistical theory and related fields |d 2017 |g 2(2018), 1 vom: 12., Seite 80-88 |w (DE-627)NLM277501288 |x 2475-4277 |7 nnns
773 1 8 |g volume:2 |g year:2018 |g number:1 |g day:12 |g pages:80-88
856 4 0 |u http://dx.doi.org/10.1080/24754269.2018.1466096 |3 Volltext
912 |a GBV_USEFLAG_A
912 |a GBV_NLM
951 |a AR
952 |d 2 |j 2018 |e 1 |b 12 |h 80-88