SPLASH: A statistical, reference-free genomic algorithm unifies biological discovery
Copyright © 2023 The Authors. Published by Elsevier Inc. All rights reserved.
Today's genomics workflows typically require alignment to a reference sequence, which limits discovery. We introduce a unifying paradigm, SPLASH (Statistically Primary aLignment Agnostic Sequence Homing), which directly analyzes raw sequencing data, using a statistical test to detect a signature of regulation: sample-specific sequence variation. SPLASH detects many types of variation and can be efficiently run at scale. We show that SPLASH identifies complex mutation patterns in SARS-CoV-2, discovers regulated RNA isoforms at the single-cell level, detects the vast sequence diversity of adaptive immune receptors, and uncovers biology in non-model organisms undocumented in their reference genomes: geographic and seasonal variation and diatom association in eelgrass, an oceanic plant impacted by climate change, and tissue-specific transcripts in octopus. SPLASH is a unifying approach to genomic analysis that enables expansive discovery without metadata or references.
Media type: | E-article |
---|---|
Year of publication: | 2023 |
Published: | 2023 |
Contained in: | To the parent record - volume:186 |
Contained in: | Cell - 186(2023), 25, 7 Dec., pages 5440-5456.e26 |
Language: | English |
Contributors: | Chaung, Kaitlin [author] |
Subjects: | Computational biology |
Notes: | Date Completed 22.12.2023 |
Notes: | Date Revised 14.02.2024 |
Notes: | published: Print |
Notes: | UpdateOf: bioRxiv. 2023 Jul 31;:. - PMID 35794890 |
Notes: | Citation Status MEDLINE |
doi: | 10.1016/j.cell.2023.10.028 |
PPN (catalog ID): | NLM365559318 |

LEADER | 01000caa a22002652 4500 | ||
---|---|---|---|
001 | NLM365559318 | ||
003 | DE-627 | ||
005 | 20240214232959.0 | ||
007 | cr uuu---uuuuu | ||
008 | 231226s2023 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1016/j.cell.2023.10.028 |2 doi | |
028 | 5 | 2 | |a pubmed24n1293.xml |
035 | |a (DE-627)NLM365559318 | ||
035 | |a (NLM)38065078 | ||
035 | |a (PII)S0092-8674(23)01179-0 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Chaung, Kaitlin |e verfasserin |4 aut | |
245 | 1 | 0 | |a SPLASH |b A statistical, reference-free genomic algorithm unifies biological discovery |
264 | 1 | |c 2023 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Date Completed 22.12.2023 | ||
500 | |a Date Revised 14.02.2024 | ||
500 | |a published: Print | ||
500 | |a UpdateOf: bioRxiv. 2023 Jul 31;:. - PMID 35794890 | ||
500 | |a Citation Status MEDLINE | ||
520 | |a Copyright © 2023 The Authors. Published by Elsevier Inc. All rights reserved. | ||
520 | |a Today's genomics workflows typically require alignment to a reference sequence, which limits discovery. We introduce a unifying paradigm, SPLASH (Statistically Primary aLignment Agnostic Sequence Homing), which directly analyzes raw sequencing data, using a statistical test to detect a signature of regulation: sample-specific sequence variation. SPLASH detects many types of variation and can be efficiently run at scale. We show that SPLASH identifies complex mutation patterns in SARS-CoV-2, discovers regulated RNA isoforms at the single-cell level, detects the vast sequence diversity of adaptive immune receptors, and uncovers biology in non-model organisms undocumented in their reference genomes: geographic and seasonal variation and diatom association in eelgrass, an oceanic plant impacted by climate change, and tissue-specific transcripts in octopus. SPLASH is a unifying approach to genomic analysis that enables expansive discovery without metadata or references. | ||
650 | 4 | |a Journal Article | |
650 | 4 | |a RNA-seq | |
650 | 4 | |a computational biology | |
650 | 4 | |a genetics | |
650 | 4 | |a genomics | |
650 | 4 | |a reference-free | |
650 | 4 | |a single-cell RNA-seq | |
650 | 4 | |a splicing | |
650 | 4 | |a statistics | |
650 | 7 | |a HLA Antigens |2 NLM | |
700 | 1 | |a Baharav, Tavor Z |e verfasserin |4 aut | |
700 | 1 | |a Henderson, George |e verfasserin |4 aut | |
700 | 1 | |a Zheludev, Ivan N |e verfasserin |4 aut | |
700 | 1 | |a Wang, Peter L |e verfasserin |4 aut | |
700 | 1 | |a Salzman, Julia |e verfasserin |4 aut | |
773 | 0 | 8 | |i Enthalten in |t Cell |d 1974 |g 186(2023), 25 vom: 07. Dez., Seite 5440-5456.e26 |w (DE-627)NLM000088935 |x 1097-4172 |7 nnns |
773 | 1 | 8 | |g volume:186 |g year:2023 |g number:25 |g day:07 |g month:12 |g pages:5440-5456.e26 |
856 | 4 | 0 | |u http://dx.doi.org/10.1016/j.cell.2023.10.028 |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a GBV_NLM | ||
951 | |a AR | ||
952 | |d 186 |j 2023 |e 25 |b 07 |c 12 |h 5440-5456.e26 |