Applications of Community Detection Algorithms to Large Biological Datasets
Recent advances in data acquisition technologies in biology have led to major challenges in mining relevant information from large datasets. For example, single-cell RNA sequencing technologies produce expression and sequence information from tens of thousands of cells in a single experiment. A common task in analyzing biological data is to cluster samples or features (e.g., genes) into groups sharing common characteristics. This is an NP-hard problem for which numerous heuristic algorithms have been developed. However, in many cases, the clusters created by these algorithms do not reflect biological reality. To overcome this, a Networks Based Clustering (NBC) approach was recently proposed, in which the samples or genes in the dataset are first mapped to a network and then community detection (CD) algorithms are used to identify clusters of nodes. Here, we created an open and flexible Python-based toolkit for NBC that enables easy and accessible network construction and community detection. We then tested the applicability of NBC for identifying clusters of cells or genes from previously published large-scale single-cell and bulk RNA-seq datasets. We show that NBC can be used to accurately and efficiently analyze large-scale datasets of RNA sequencing experiments.
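The two NBC steps named in the abstract (map samples to a network, then run community detection on it) can be illustrated with a minimal sketch. This is not the authors' toolkit: the function names `build_knn_graph` and `label_propagation`, the choice of a k-nearest-neighbour graph with Euclidean distance, and the use of label propagation as the CD algorithm are all assumptions made for illustration, and the toy points stand in for expression profiles.

```python
import math
import random
from collections import Counter


def build_knn_graph(points, k):
    """Step 1 (assumed variant): link each sample to its k nearest neighbours."""
    adjacency = {i: set() for i in range(len(points))}
    for i, p in enumerate(points):
        # Rank all other samples by Euclidean distance to sample i.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        for j in others[:k]:
            # Store the edge in both directions so the graph is undirected.
            adjacency[i].add(j)
            adjacency[j].add(i)
    return adjacency


def label_propagation(adjacency, seed=0, max_sweeps=100):
    """Step 2 (assumed variant): asynchronous label propagation."""
    rng = random.Random(seed)
    labels = {node: node for node in adjacency}  # every node starts alone
    nodes = list(adjacency)
    for _ in range(max_sweeps):
        rng.shuffle(nodes)
        changed = False
        for node in nodes:
            if not adjacency[node]:
                continue  # isolated nodes keep their own label
            counts = Counter(labels[nb] for nb in adjacency[node])
            top = max(counts.values())
            best = sorted(lab for lab, c in counts.items() if c == top)
            # Keep the current label if it is already a most-frequent one;
            # otherwise adopt one of the most frequent neighbour labels.
            if labels[node] not in best:
                labels[node] = rng.choice(best)
                changed = True
        if not changed:
            break
    communities = {}
    for node, lab in labels.items():
        communities.setdefault(lab, set()).add(node)
    return sorted(communities.values(), key=min)


# Toy data: two well-separated groups of five "samples" each.
points = [
    (0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.1, 0.4),
    (5.0, 5.0), (5.1, 5.2), (5.2, 5.1), (5.3, 5.3), (5.1, 5.4),
]
communities = label_propagation(build_knn_graph(points, k=4))
print(communities)
```

On this toy input the two groups are disconnected in the neighbour graph, so each ends up as its own community. On real expression data the distance measure, the value of k, and the CD algorithm would all need to be chosen and validated for the dataset at hand.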
Field | Value
---|---
Media type | E-article
Year of publication | 2021
Published | 2021
Contained in | Methods in molecular biology (Clifton, N.J.) - 2243(2021), dated: 19., pages 59-80 (volume: 2243)
Language | English
Contributors | Kanter, Itamar [author]
Subjects | Big data
Notes | Date Completed 02.04.2021; Date Revised 02.04.2021; published: Print; Citation Status MEDLINE
doi | 10.1007/978-1-0716-1103-6_3
PPN (catalog ID) | NLM321649982
LEADER | 01000naa a22002652 4500 | ||
---|---|---|---|
001 | NLM321649982 | ||
003 | DE-627 | ||
005 | 20231225180327.0 | ||
007 | cr uuu---uuuuu | ||
008 | 231225s2021 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1007/978-1-0716-1103-6_3 |2 doi | |
028 | 5 | 2 | |a pubmed24n1072.xml |
035 | |a (DE-627)NLM321649982 | ||
035 | |a (NLM)33606252 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Kanter, Itamar |e verfasserin |4 aut | |
245 | 1 | 0 | |a Applications of Community Detection Algorithms to Large Biological Datasets |
264 | 1 | |c 2021 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Date Completed 02.04.2021 | ||
500 | |a Date Revised 02.04.2021 | ||
500 | |a published: Print | ||
500 | |a Citation Status MEDLINE | ||
520 | |a Recent advances in data acquisition technologies in biology have led to major challenges in mining relevant information from large datasets. For example, single-cell RNA sequencing technologies produce expression and sequence information from tens of thousands of cells in a single experiment. A common task in analyzing biological data is to cluster samples or features (e.g., genes) into groups sharing common characteristics. This is an NP-hard problem for which numerous heuristic algorithms have been developed. However, in many cases, the clusters created by these algorithms do not reflect biological reality. To overcome this, a Networks Based Clustering (NBC) approach was recently proposed, in which the samples or genes in the dataset are first mapped to a network and then community detection (CD) algorithms are used to identify clusters of nodes. Here, we created an open and flexible Python-based toolkit for NBC that enables easy and accessible network construction and community detection. We then tested the applicability of NBC for identifying clusters of cells or genes from previously published large-scale single-cell and bulk RNA-seq datasets. We show that NBC can be used to accurately and efficiently analyze large-scale datasets of RNA sequencing experiments. | ||
650 | 4 | |a Journal Article | |
650 | 4 | |a Big data | |
650 | 4 | |a Community detection | |
650 | 4 | |a Networks based clustering | |
650 | 4 | |a Single-cell RNA sequencing | |
700 | 1 | |a Yaari, Gur |e verfasserin |4 aut | |
700 | 1 | |a Kalisky, Tomer |e verfasserin |4 aut | |
773 | 0 | 8 | |i Enthalten in |t Methods in molecular biology (Clifton, N.J.) |d 1984 |g 2243(2021) vom: 19., Seite 59-80 |w (DE-627)NLM074849794 |x 1940-6029 |7 nnns |
773 | 1 | 8 | |g volume:2243 |g year:2021 |g day:19 |g pages:59-80 |
856 | 4 | 0 | |u http://dx.doi.org/10.1007/978-1-0716-1103-6_3 |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a GBV_NLM | ||
951 | |a AR | ||
952 | |d 2243 |j 2021 |b 19 |h 59-80 |