Publication details - A look inside the black box

A look inside the black box : Using graph-theoretical descriptors to interpret a Continuous-Filter Convolutional Neural Network (CF-CNN) trained on the global and local minimum energy structures of neutral water clusters

We describe a method for the post-hoc interpretation of a neural network (NN) trained on the global and local minima of neutral water clusters. We use the structures reported in a recently published database containing over 5 × 10^6 unique water cluster networks (H2O)_N of size N = 3-30. The structural properties were first characterized using chemical descriptors derived from graph theory, identifying important trends in topology, connectivity, and polygon structure of the networks associated with the various minima. The code to generate the molecular graphs and compute the descriptors is available at https://github.com/exalearn/molecular-graph-descriptors, and the graphs are available alongside the original database at https://sites.uw.edu/wdbase/. A Continuous-Filter Convolutional Neural Network (CF-CNN) was trained on a subset of 500 000 networks to predict the potential energy, yielding a mean absolute error of 0.002 ± 0.002 kcal/mol per water molecule. Clusters of sizes not included in the training set exhibited errors of the same magnitude, indicating that the CF-CNN protocol accurately predicts energies of networks for both smaller and larger sizes than those used during training. The graph-theoretical descriptors were further employed to interpret the predictive power of the CF-CNN. Topological measures, such as the Wiener index, the average shortest path length, and the similarity index, suggested that all networks from the test set were within the same range of values as those from the training set. The graph analysis suggests that larger errors appear when the mean degree and the number of polygons in the cluster lie further from the mean of the training set. This indicates that the structural space, and not just the chemical space, is an important factor to consider when designing training sets, as predictive errors can result when the structural composition is sufficiently different from the bulk of those in the training set. In this way, the developed descriptors prove effective in explaining the results of the CF-CNN (the "black box") model.
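
As a rough illustration of the kind of topological measures named in the abstract, the following minimal sketch computes the Wiener index, the average shortest path length, and the mean degree for a small graph. It is not taken from the authors' repository: the function names, the adjacency-dictionary input, and the four-membered ring are hypothetical, and it assumes a connected, undirected graph in which nodes stand for water molecules and edges for hydrogen bonds. The count of independent cycles (edges - nodes + 1) is used here only as a simple stand-in for a polygon count and may differ from the descriptor used in the paper.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Return hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def descriptors(adj):
    """Compute simple topological descriptors.

    Assumes a connected, undirected graph with at least two nodes,
    given as {node: [neighbours]} with every edge listed in both directions.
    """
    nodes = sorted(adj)
    n = len(nodes)
    all_dist = {u: bfs_distances(adj, u) for u in nodes}
    # Wiener index: sum of shortest-path lengths over all unordered node pairs.
    wiener = sum(all_dist[u][v] for u, v in combinations(nodes, 2))
    n_pairs = n * (n - 1) // 2
    # Each undirected edge appears twice in the adjacency lists.
    n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    return {
        "wiener_index": wiener,
        "avg_shortest_path": wiener / n_pairs,
        "mean_degree": 2 * n_edges / n,
        # Stand-in for a polygon count (valid for a connected graph).
        "independent_cycles": n_edges - n + 1,
    }


# Hypothetical toy input: a four-membered ring of hydrogen-bonded molecules.
ring4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
result = descriptors(ring4)
print(result)
```

Comparing such values for a test structure against their distribution in the training set is one simple way to see whether the structure lies far from the bulk of the training data, which is the kind of check the abstract describes.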

Media type:	E-article

Year of publication:	2020
Published:	2020

Contained in:	The Journal of chemical physics - 153(2020), no. 2, 14 July, page 024302

Language:	English

Contributors:	Bilbrey, Jenna A [author]; Heindel, Joseph P [author]; Schram, Malachi [author]; Bandyopadhyay, Pradipta [author]; Xantheas, Sotiris S [author]; Choudhury, Sutanay [author]

Links:	Full text

Topics:	Journal Article

Notes:	Date Revised 16.07.2020; published: Print; Citation Status: PubMed-not-MEDLINE

doi:	10.1063/5.0009933

PPN (catalog ID):	NLM312453701

Internal format


LEADER	01000naa a22002652 4500
001	NLM312453701
003	DE-627
005	20231225144440.0
007	cr uuu---uuuuu
008	231225s2020 xx \|\|\|\|\|o 00\| \|\|eng c
024	7		\|a 10.1063/5.0009933 \|2 doi
028	5	2	\|a pubmed24n1041.xml
035			\|a (DE-627)NLM312453701
035			\|a (NLM)32668919
040			\|a DE-627 \|b ger \|c DE-627 \|e rakwb
041			\|a eng
100	1		\|a Bilbrey, Jenna A \|e verfasserin \|4 aut
245	1	2	\|a A look inside the black box \|b Using graph-theoretical descriptors to interpret a Continuous-Filter Convolutional Neural Network (CF-CNN) trained on the global and local minimum energy structures of neutral water clusters
264		1	\|c 2020
336			\|a Text \|b txt \|2 rdacontent
337			\|a Computermedien \|b c \|2 rdamedia
338			\|a Online-Ressource \|b cr \|2 rdacarrier
500			\|a Date Revised 16.07.2020
500			\|a published: Print
500			\|a Citation Status PubMed-not-MEDLINE
520			\|a We describe a method for the post-hoc interpretation of a neural network (NN) trained on the global and local minima of neutral water clusters. We use the structures reported in a recently published database containing over 5 × 10^6 unique water cluster networks (H2O)_N of size N = 3-30. The structural properties were first characterized using chemical descriptors derived from graph theory, identifying important trends in topology, connectivity, and polygon structure of the networks associated with the various minima. The code to generate the molecular graphs and compute the descriptors is available at https://github.com/exalearn/molecular-graph-descriptors, and the graphs are available alongside the original database at https://sites.uw.edu/wdbase/. A Continuous-Filter Convolutional Neural Network (CF-CNN) was trained on a subset of 500 000 networks to predict the potential energy, yielding a mean absolute error of 0.002 ± 0.002 kcal/mol per water molecule. Clusters of sizes not included in the training set exhibited errors of the same magnitude, indicating that the CF-CNN protocol accurately predicts energies of networks for both smaller and larger sizes than those used during training. The graph-theoretical descriptors were further employed to interpret the predictive power of the CF-CNN. Topological measures, such as the Wiener index, the average shortest path length, and the similarity index, suggested that all networks from the test set were within the same range of values as those from the training set. The graph analysis suggests that larger errors appear when the mean degree and the number of polygons in the cluster lie further from the mean of the training set. This indicates that the structural space, and not just the chemical space, is an important factor to consider when designing training sets, as predictive errors can result when the structural composition is sufficiently different from the bulk of those in the training set. In this way, the developed descriptors prove effective in explaining the results of the CF-CNN (the "black box") model.
650		4	\|a Journal Article
700	1		\|a Heindel, Joseph P \|e verfasserin \|4 aut
700	1		\|a Schram, Malachi \|e verfasserin \|4 aut
700	1		\|a Bandyopadhyay, Pradipta \|e verfasserin \|4 aut
700	1		\|a Xantheas, Sotiris S \|e verfasserin \|4 aut
700	1		\|a Choudhury, Sutanay \|e verfasserin \|4 aut
773	0	8	\|i Enthalten in \|t The Journal of chemical physics \|d 1963 \|g 153(2020), 2 vom: 14. Juli, Seite 024302 \|w (DE-627)NLM042699096 \|x 1089-7690 \|7 nnns
773	1	8	\|g volume:153 \|g year:2020 \|g number:2 \|g day:14 \|g month:07 \|g pages:024302
856	4	0	\|u http://dx.doi.org/10.1063/5.0009933 \|3 Volltext
912			\|a GBV_USEFLAG_A
912			\|a GBV_NLM
951			\|a AR
952			\|d 153 \|j 2020 \|e 2 \|b 14 \|c 07 \|h 024302
