Study on Text Retrieval Based on Pre-training and Deep Hash

To address low retrieval efficiency and accuracy in text retrieval, a retrieval model based on a pre-trained language model and a deep hash method is proposed. First, the prior knowledge of text contained in the pre-trained language model is introduced through transfer learning, and the input is then transformed into a high-dimensional vector representation by feature extraction. A hash learning layer is added at the back end of the model, and the model parameters are fine-tuned against specifically designed optimization objectives, so that the hash function and a unique hash representation of each input are learned dynamically during training. Experimental results show that the retrieval accuracy of this method is at least 21.70% and 21.38% higher than that of the other benchmark models at top-5 and top-10, respectively. Introducing hash codes improves retrieval speed by a factor of 40 at the cost of only a 4.78% loss in accuracy. This method can therefore significantly improve retrieval accuracy and efficiency, and it has potential applications in the field of text retrieval.
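The retrieval step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the hash layer's outputs are binarized or how codes are compared, so the sign-based binarization, the Hamming-distance ranking, and all names below (`to_hash_code`, `hamming`, `top_k`, the toy vectors) are assumptions for illustration, not the authors' implementation.

```python
from typing import List, Sequence, Tuple


def to_hash_code(features: Sequence[float]) -> int:
    """Binarize a real-valued vector by sign: a bit is 1 when the value is > 0.

    Assumption: the paper's hash layer may binarize differently.
    """
    code = 0
    for value in features:
        code = (code << 1) | (1 if value > 0 else 0)
    return code


def hamming(a: int, b: int) -> int:
    """Count the differing bits between two packed hash codes."""
    return bin(a ^ b).count("1")


def top_k(query: int, corpus: List[int], k: int) -> List[Tuple[int, int]]:
    """Return (index, distance) pairs for the k corpus codes nearest the query."""
    scored = [(i, hamming(query, c)) for i, c in enumerate(corpus)]
    # Sort by distance, breaking ties by corpus position.
    return sorted(scored, key=lambda pair: (pair[1], pair[0]))[:k]


# Hypothetical 4-dimensional activations standing in for hash-layer outputs.
corpus_features = [
    [0.9, -0.2, 0.4, -0.7],
    [-0.8, 0.3, -0.5, 0.6],
    [0.7, -0.1, -0.3, -0.9],
]
corpus_codes = [to_hash_code(f) for f in corpus_features]
query_code = to_hash_code([0.8, -0.4, 0.5, -0.6])
results = top_k(query_code, corpus_codes, k=2)
```

Comparing packed binary codes needs only an XOR and a bit count per candidate, which is the general reason hash codes tend to be faster to search than high-dimensional real-valued vectors.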

Media type:

E-article

Year of publication:

2021

Published:

2021

Contained in:

To the parent record - volume:48

Contained in:

Jisuanji kexue - 48(2021), 11, pages 300-306

Language:

Chinese

Contributors:

ZOU Ao, HAO Wen-ning, JIN Da-wei, CHEN Gang, TIAN Yuan [author]

Links:

doi.org [free access]
doaj.org [free access]
www.jsjkx.com [free access]
Journal TOC [free access]

Subjects:

Computer software
Deep learning|similarity retrieval|pre-trained language model|deep hash
Technology (General)

doi:

10.11896/jsjkx.210300266

PPN (catalog ID):

DOAJ075344475