REscan: inferring repeat expansions and structural variation in paired-end short read sequencing data

© The Author(s) 2020. Published by Oxford University Press.

MOTIVATION: Repeat expansions are an important class of genetic variation in neurological diseases. Identifying novel repeat expansions with conventional sequencing methods is challenging, however, because expansions are typically long relative to short sequence reads and because repetitive sequence is difficult to align accurately and uniquely. This latter property can nevertheless be harnessed in paired-end sequencing data to infer the possible locations of repeat expansions and other structural variation.

RESULTS: This article presents REscan, a command-line utility that infers repeat expansion loci from paired-end short read sequencing data by reporting the proportion of reads orientated towards a locus that do not have an adequately mapped mate. A high REscan statistic relative to a population of data suggests a repeat expansion locus for experimental follow-up. This approach is validated using genome sequence data for 259 cases of amyotrophic lateral sclerosis, of which 24 are positive for a large repeat expansion in C9orf72, showing that REscan statistics readily discriminate repeat expansion carriers from non-carriers.
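The statistic described above can be illustrated with a minimal sketch. It is not REscan's C implementation: the `Read` record, the flanking `window`, the orientation rule and the `min_mate_mapq` threshold used to decide whether a mate is "adequately mapped" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Read:
    """Hypothetical minimal view of one aligned read (not REscan's data model)."""
    pos: int           # leftmost aligned position of the read
    is_reverse: bool   # True if the read aligns to the reverse strand
    mate_mapped: bool  # True if the mate was aligned at all
    mate_mapq: int     # mapping quality of the mate (ignored if unmapped)


def faces_locus(read: Read, locus: int, window: int) -> bool:
    """Assumed orientation rule: forward reads upstream of the locus and
    reverse reads downstream of it are treated as pointing towards it."""
    if read.is_reverse:
        return locus < read.pos <= locus + window
    return locus - window <= read.pos < locus


def unmapped_mate_proportion(
    reads: Iterable[Read],
    locus: int,
    window: int = 500,        # assumed flank size in base pairs
    min_mate_mapq: int = 20,  # assumed cut-off for an "adequately mapped" mate
) -> Optional[float]:
    """Proportion of locus-facing reads whose mate is unmapped or poorly mapped.

    Returns None when no read faces the locus, so that an empty region is
    not confused with a proportion of zero.
    """
    oriented = [r for r in reads if faces_locus(r, locus, window)]
    if not oriented:
        return None
    poor = sum(
        1 for r in oriented
        if not r.mate_mapped or r.mate_mapq < min_mate_mapq
    )
    return poor / len(oriented)


# Toy data: four reads face the locus at position 1000, and two of those
# have an unmapped or low-quality mate. The last read faces away.
reads = [
    Read(pos=900, is_reverse=False, mate_mapped=True, mate_mapq=60),
    Read(pos=950, is_reverse=False, mate_mapped=False, mate_mapq=0),
    Read(pos=1100, is_reverse=True, mate_mapped=True, mate_mapq=3),
    Read(pos=1200, is_reverse=True, mate_mapped=True, mate_mapq=60),
    Read(pos=1050, is_reverse=False, mate_mapped=False, mate_mapq=0),
]
print(unmapped_mate_proportion(reads, locus=1000))  # 0.5
```

In practice the value for one sample would be compared with the distribution of the same statistic across a population of samples, since only a proportion that is high relative to that population points to a candidate locus.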

AVAILABILITY AND IMPLEMENTATION: C source code at https://github.com/rlmcl/rescan (GNU General Public Licence v3).

Media type:

E-article

Year of publication:

2021

Published:

2021

Contained in:

Parent record - volume:37

Contained in:

Bioinformatics (Oxford, England) - 37(2021), no. 6, 5 May, pages 871-872

Language:

English

Contributors:

McLaughlin, Russell Lewis [Author]

Subjects:

C9orf72 Protein
Journal Article
Research Support, Non-U.S. Gov't

Notes:

Date Completed 03.06.2021

Date Revised 19.05.2023

published: Print

Citation Status MEDLINE

doi:

10.1093/bioinformatics/btaa753

PPN (catalog ID):

NLM314185992