Seeing through noise in power laws
Despite widespread claims of power laws across the natural and social sciences, evidence in data is often equivocal. Modern data and statistical methods reject even classic power laws such as Pareto's law of wealth and the Gutenberg-Richter law for earthquake magnitudes. We show that the maximum-likelihood estimators and Kolmogorov-Smirnov (K-S) statistics in widespread use are unexpectedly sensitive to ubiquitous errors in data such as measurement noise, quantization noise, heaping and censorship of small values. This sensitivity causes spurious rejection of power laws and biases parameter estimates even in arbitrarily large samples, which explains inconsistencies between theory and data. We show that logarithmic binning by powers of λ > 1 attenuates these errors in a manner analogous to noise averaging in normal statistics and that λ thereby tunes a trade-off between accuracy and precision in estimation. Binning also removes potentially misleading within-scale information while preserving information about the shape of a distribution over powers of λ, and we show that some amount of binning can improve sensitivity and specificity of K-S tests without any cost, while more extreme binning tunes a trade-off between sensitivity and specificity. We therefore advocate logarithmic binning as a simple essential step in power-law inference.
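The logarithmic-binning step described in the abstract can be sketched in a few lines. This is a minimal illustration under assumed conventions (bin edges at xmin·λ^k for integer k; the function name `log_bin` and its parameters are hypothetical), not the authors' implementation:

```python
import math
import random
from collections import Counter

def log_bin(samples, lam=2.0, xmin=1.0):
    """Count samples per logarithmic bin [xmin * lam**k, xmin * lam**(k+1))."""
    counts = Counter()
    for x in samples:
        if x >= xmin:
            # Bin index k is the integer part of log_lam(x / xmin).
            k = math.floor(math.log(x / xmin, lam))
            counts[k] += 1
    return dict(sorted(counts.items()))

# Example: Pareto-like samples via the inverse-CDF transform, then binned
# by powers of lambda = 2, which pools within-scale fluctuations.
random.seed(0)
data = [1.0 / (1.0 - random.random()) for _ in range(10_000)]
bins = log_bin(data, lam=2.0)
```

Per the abstract, the choice of λ then tunes the trade-off between accuracy and precision: larger λ pools more within-scale noise into each bin at the cost of coarser shape information.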
| Field | Value |
|---|---|
| Media type | E-article |
| Year of publication | 2023 |
| Published | 2023 |
| Contained in | Link to complete record - volume:20 |
| Contained in | Journal of the Royal Society, Interface - 20(2023), 205 of 29 Aug., page 20230310 |
| Language | English |
| Contributors | Lin, Qianying [Author]; Newberry, Mitchell [Author] |
| Links | |
| Topics | Extreme value |
| Notes | Date Completed 31.08.2023; Date Revised 29.12.2023; published: Print-Electronic; figshare: 10.6084/m9.figshare.c.6781089; Citation Status PubMed-not-MEDLINE |
| DOI | 10.1098/rsif.2023.0310 |
| Funding | |
| Funding institution / project title | |
| PPN (catalogue ID) | NLM361415508 |
LEADER | 01000caa a22002652 4500 | ||
---|---|---|---|
001 | NLM361415508 | ||
003 | DE-627 | ||
005 | 20240108135915.0 | ||
007 | cr uuu---uuuuu | ||
008 | 231226s2023 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1098/rsif.2023.0310 |2 doi | |
028 | 5 | 2 | |a pubmed24n1242.xml |
035 | |a (DE-627)NLM361415508 | ||
035 | |a (NLM)37643642 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Lin, Qianying |e verfasserin |4 aut | |
245 | 1 | 0 | |a Seeing through noise in power laws |
264 | 1 | |c 2023 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Date Completed 31.08.2023 | ||
500 | |a Date Revised 29.12.2023 | ||
500 | |a published: Print-Electronic | ||
500 | |a figshare: 10.6084/m9.figshare.c.6781089 | ||
500 | |a Citation Status PubMed-not-MEDLINE | ||
520 | |a Despite widespread claims of power laws across the natural and social sciences, evidence in data is often equivocal. Modern data and statistical methods reject even classic power laws such as Pareto's law of wealth and the Gutenberg-Richter law for earthquake magnitudes. We show that the maximum-likelihood estimators and Kolmogorov-Smirnov (K-S) statistics in widespread use are unexpectedly sensitive to ubiquitous errors in data such as measurement noise, quantization noise, heaping and censorship of small values. This sensitivity causes spurious rejection of power laws and biases parameter estimates even in arbitrarily large samples, which explains inconsistencies between theory and data. We show that logarithmic binning by powers of λ > 1 attenuates these errors in a manner analogous to noise averaging in normal statistics and that λ thereby tunes a trade-off between accuracy and precision in estimation. Binning also removes potentially misleading within-scale information while preserving information about the shape of a distribution over powers of λ, and we show that some amount of binning can improve sensitivity and specificity of K-S tests without any cost, while more extreme binning tunes a trade-off between sensitivity and specificity. We therefore advocate logarithmic binning as a simple essential step in power-law inference | ||
650 | 4 | |a Journal Article | |
650 | 4 | |a Research Support, Non-U.S. Gov't | |
650 | 4 | |a Pareto distribution | |
650 | 4 | |a extreme value | |
650 | 4 | |a fat tail | |
650 | 4 | |a scale-free | |
650 | 4 | |a self-similarity | |
650 | 4 | |a tail index | |
700 | 1 | |a Newberry, Mitchell |e verfasserin |4 aut | |
773 | 0 | 8 | |i Enthalten in |t Journal of the Royal Society, Interface |d 2004 |g 20(2023), 205 vom: 29. Aug., Seite 20230310 |w (DE-627)NLM164211012 |x 1742-5662 |7 nnns |
773 | 1 | 8 | |g volume:20 |g year:2023 |g number:205 |g day:29 |g month:08 |g pages:20230310 |
856 | 4 | 0 | |u http://dx.doi.org/10.1098/rsif.2023.0310 |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a GBV_NLM | ||
951 | |a AR | ||
952 | |d 20 |j 2023 |e 205 |b 29 |c 08 |h 20230310 |