Optimal Rates for Phylogenetic Inference and Experimental Design in the Era of Genome-Scale Data Sets
With the rise of genome-scale data sets, there has been a call for increased data scrutiny and careful selection of loci that are appropriate for resolving a given phylogenetic problem. Such loci should maximize phylogenetic information content while minimizing the risk of homoplasy. Theory posits the existence of characters that evolve at an optimum rate, and efforts to determine optimal rates for inference have been a cornerstone of phylogenetic experimental design for over two decades. However, both theoretical and empirical investigations of optimal rates have varied dramatically in their conclusions, ranging from no relationship to a tight relationship between the rate of change and phylogenetic utility. Herein, we synthesize these apparently contradictory views, demonstrating both empirical and theoretical conditions under which each is correct. We find that optimal rates of characters, not genes, are generally robust to most experimental design decisions. Moreover, consideration of site rate heterogeneity within a given locus is critical to accurate predictions of utility. Factors such as taxon sampling or the targeted number of characters providing support for a topology are additionally critical to predictions of phylogenetic utility based on the rate of character change. Further, optimality of rates and predictions of phylogenetic utility are not equivalent, demonstrating the need for further development of a comprehensive theory of phylogenetic experimental design. [Divergence time; GC bias; homoplasy; incongruence; information content; internode length; optimal rates; phylogenetic informativeness; phylogenetic theory; phylogenetic utility; phylogenomics; signal and noise; subtending branch length; state space; taxon and character sampling.]
| Field | Value |
|---|---|
| Media type | E-article |
| Year of publication | 2019 |
| Published | 2019 |
| Contained in | Parent record - volume:68 |
| Contained in | Systematic biology - 68(2019), no. 1, 1 Jan., pages 145-156 |
| Language | English |
| Contributors | Dornburg, Alex [author] |
| Notes | Date Completed 20.12.2018; Date Revised 20.12.2018; published: Print; Citation Status MEDLINE |
| doi | 10.1093/sysbio/syy047 |
| PPN (catalog ID) | NLM285829580 |

LEADER | 01000naa a22002652 4500 | ||
---|---|---|---|
001 | NLM285829580 | ||
003 | DE-627 | ||
005 | 20231225050317.0 | ||
007 | cr uuu---uuuuu | ||
008 | 231225s2019 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1093/sysbio/syy047 |2 doi | |
028 | 5 | 2 | |a pubmed24n0952.xml |
035 | |a (DE-627)NLM285829580 | ||
035 | |a (NLM)29939341 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Dornburg, Alex |e verfasserin |4 aut | |
245 | 1 | 0 | |a Optimal Rates for Phylogenetic Inference and Experimental Design in the Era of Genome-Scale Data Sets |
264 | 1 | |c 2019 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Date Completed 20.12.2018 | ||
500 | |a Date Revised 20.12.2018 | ||
500 | |a published: Print | ||
500 | |a Citation Status MEDLINE | ||
650 | 4 | |a Journal Article | |
650 | 4 | |a Research Support, Non-U.S. Gov't | |
700 | 1 | |a Su, Zhuo |e verfasserin |4 aut | |
700 | 1 | |a Townsend, Jeffrey P |e verfasserin |4 aut | |
773 | 0 | 8 | |i Enthalten in |t Systematic biology |d 1996 |g 68(2019), 1 vom: 01. Jan., Seite 145-156 |w (DE-627)NLM092550762 |x 1076-836X |7 nnns |
773 | 1 | 8 | |g volume:68 |g year:2019 |g number:1 |g day:01 |g month:01 |g pages:145-156 |
856 | 4 | 0 | |u http://dx.doi.org/10.1093/sysbio/syy047 |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a GBV_NLM | ||
951 | |a AR | ||
952 | |d 68 |j 2019 |e 1 |b 01 |c 01 |h 145-156 |