SCMP-IL: an incremental learning method with super constraints on model parameters
Abstract: Deep learning technology plays an important role in our lives. Because deep learning relies on neural network models, it is still plagued by the catastrophic forgetting problem: a neural network model forgets what it has previously learned after it learns new knowledge. A neural network model learns knowledge from labeled samples, and this knowledge is stored in its parameters. Therefore, many methods try to solve this problem from the perspective of constraining parameters and storing samples; few address it from the perspective of constraining the feature outputs of the neural network model. This paper proposes an incremental learning method with super constraints on model parameters. The method calculates not only a parameter similarity loss between the old and new models but also a layer output feature similarity loss between them, and thus suppresses changes in the model parameters from two directions. In addition, we propose a new strategy for selecting representative samples from the dataset and for tackling the imbalance between stored samples and new task samples. Finally, we utilize neural kernel mapping support vector machine theory to increase the interpretability of the model. To better reflect real-world conditions, five sample sets with different categories and sizes were employed in the experiments. The experiments demonstrate the effectiveness of our method: for example, after learning the last task, our method is at least 1.930% and 0.562% higher than other methods on the training set and test set, respectively.
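The two constraint losses named in the abstract (a parameter similarity loss and a layer output feature similarity loss between the old and new models) can be illustrated with a minimal sketch. All tensors, function names, and the weighting factors `lam` and `mu` below are hypothetical; this shows only the general idea of a combined two-direction constraint, not the paper's exact formulation:

```python
import numpy as np

def parameter_similarity_loss(old_params, new_params):
    """Sum of squared differences between corresponding old/new parameter tensors."""
    return sum(np.sum((o - n) ** 2) for o, n in zip(old_params, new_params))

def feature_similarity_loss(old_feats, new_feats):
    """Sum of squared differences between corresponding layer outputs."""
    return sum(np.sum((o - n) ** 2) for o, n in zip(old_feats, new_feats))

# Hypothetical example: two layers' weights and one layer's output on a batch.
old_w = [np.ones((2, 2)), np.zeros(3)]
new_w = [np.ones((2, 2)) * 1.1, np.zeros(3)]
old_f = [np.array([1.0, 2.0])]
new_f = [np.array([1.0, 2.5])]

lam, mu = 0.5, 0.5  # hypothetical weighting factors for the two losses
total_constraint = (lam * parameter_similarity_loss(old_w, new_w)
                    + mu * feature_similarity_loss(old_f, new_f))
```

In training, such a `total_constraint` term would be added to the task loss so that the new model stays close to the old one both in parameter space and in the features each layer produces.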
Media type: Article
Year of publication: 2022
Published: 2022
Contained in: Zur Gesamtaufnahme - volume:14
Contained in: International journal of machine learning and cybernetics - 14(2022), issue 5, 27 Nov., pages 1751-1767
Language: English
Contributors: Han, Jidong [author]
Links: Full text [license required]
BKL: 54.72
Subjects: Catastrophic forgetting
Notes: © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2022. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
DOI: 10.1007/s13042-022-01725-1
Funding:
Funding institution / project title:
PPN (catalog ID): OLC2134564377
LEADER 01000caa a22002652 4500
001 OLC2134564377
003 DE-627
005 20240327041129.0
007 tu
008 230510s2022 xx ||||| 00| ||eng c
024 7 |a 10.1007/s13042-022-01725-1 |2 doi
035 |a (DE-627)OLC2134564377
035 |a (DE-He213)s13042-022-01725-1-p
040 |a DE-627 |b ger |c DE-627 |e rakwb
041 |a eng
082 0 4 |a 004 |a 000 |a 570 |q VZ
082 0 4 |a 000 |q VZ
084 |a 54.72 |2 bkl
100 1 |a Han, Jidong |e verfasserin |0 (orcid)0000-0002-4945-2150 |4 aut
245 1 0 |a SCMP-IL: an incremental learning method with super constraints on model parameters
264 1 |c 2022
336 |a Text |b txt |2 rdacontent
337 |a ohne Hilfsmittel zu benutzen |b n |2 rdamedia
338 |a Band |b nc |2 rdacarrier
500 |a © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2022. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
520 |a Abstract: Deep learning technology plays an important role in our lives. Because deep learning relies on neural network models, it is still plagued by the catastrophic forgetting problem: a neural network model forgets what it has previously learned after it learns new knowledge. A neural network model learns knowledge from labeled samples, and this knowledge is stored in its parameters. Therefore, many methods try to solve this problem from the perspective of constraining parameters and storing samples; few address it from the perspective of constraining the feature outputs of the neural network model. This paper proposes an incremental learning method with super constraints on model parameters. The method calculates not only a parameter similarity loss between the old and new models but also a layer output feature similarity loss between them, and thus suppresses changes in the model parameters from two directions. In addition, we propose a new strategy for selecting representative samples from the dataset and for tackling the imbalance between stored samples and new task samples. Finally, we utilize neural kernel mapping support vector machine theory to increase the interpretability of the model. To better reflect real-world conditions, five sample sets with different categories and sizes were employed in the experiments. The experiments demonstrate the effectiveness of our method: for example, after learning the last task, our method is at least 1.930% and 0.562% higher than other methods on the training set and test set, respectively.
650 4 |a Incremental learning
650 4 |a Catastrophic forgetting
650 4 |a Parameter similarity loss
650 4 |a Layer output feature similarity loss
650 4 |a Neural Kernel mapping support vector machine
700 1 |a Liu, Zhaoying |4 aut
700 1 |a Li, Yujian |4 aut
700 1 |a Zhang, Ting |4 aut
773 0 8 |i Enthalten in |t International journal of machine learning and cybernetics |d Springer Berlin Heidelberg, 2010 |g 14(2022), 5 vom: 27. Nov., Seite 1751-1767 |w (DE-627)605215553 |w (DE-600)2505735-2 |w (DE-576)341091618 |x 1868-8071 |7 nnns
773 1 8 |g volume:14 |g year:2022 |g number:5 |g day:27 |g month:11 |g pages:1751-1767
856 4 1 |u https://doi.org/10.1007/s13042-022-01725-1 |z lizenzpflichtig |3 Volltext
912 |a GBV_USEFLAG_A
912 |a SYSFLAG_A
912 |a GBV_OLC
912 |a GBV_ILN_267
912 |a GBV_ILN_2018
936 b k |a 54.72 |j Künstliche Intelligenz |j Künstliche Intelligenz |q VZ
951 |a AR
952 |d 14 |j 2022 |e 5 |b 27 |c 11 |h 1751-1767