Feature selection methods for identifying genetic determinants of host species in RNA viruses
Although the emergence of zoonotic viruses in human populations depends on environmental, social and ecological factors, it is also clearly affected by genetic factors that determine cross-species transmission potential. RNA viruses make an interesting case study because their mutation rates are orders of magnitude higher than those of any other pathogen, as reflected by the recent emergence of SARS and influenza, for example. Here, we show how feature selection techniques can be used to reliably classify viral sequences by host species and to identify the crucial minority of host-specific sites in pathogen genomic data. The variability in alleles at those sites can be translated into prediction probabilities that a particular pathogen isolate is adapted to a given host. We illustrate the power of these methods by: 1) identifying the sites that explain SARS coronavirus differences between human, bat and palm civet samples; 2) showing how cross-species jumps of rabies virus among bat populations can be readily identified; and 3) identifying de novo likely functional influenza host discriminant markers.
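The idea in the abstract, ranking alignment sites by how well they separate hosts and then turning the allele at a selected site into host probabilities, can be sketched roughly as follows. This record does not say which feature selection technique the authors used, so mutual information is swapped in here purely as a simple stand-in. The function names (`site_information`, `rank_sites`, `host_probabilities`) and the four-sequence toy alignment are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def site_information(column, hosts):
    """Mutual information (in bits) between one alignment column and the host labels."""
    n = len(column)
    joint = Counter(zip(column, hosts))
    residue_counts = Counter(column)
    host_counts = Counter(hosts)
    mi = 0.0
    for (residue, host), count in joint.items():
        mi += (count / n) * log2(count * n / (residue_counts[residue] * host_counts[host]))
    return mi


def rank_sites(seqs, hosts):
    """Return (score, site_index) pairs, most host-informative site first."""
    columns = zip(*seqs)  # assumes all sequences are aligned to the same length
    scores = [(site_information(col, hosts), i) for i, col in enumerate(columns)]
    return sorted(scores, key=lambda s: (-s[0], s[1]))


def host_probabilities(seqs, hosts, site, residue):
    """Empirical P(host | residue at site), with add-one smoothing (an assumption)."""
    labels = sorted(set(hosts))
    counts = Counter(h for s, h in zip(seqs, hosts) if s[site] == residue)
    total = sum(counts.values()) + len(labels)
    return {h: (counts[h] + 1) / total for h in labels}


# Made-up toy alignment: only the first site differs between hosts.
seqs = ["ACGT", "ACGA", "TCGT", "TCGA"]
hosts = ["human", "human", "bat", "bat"]

ranking = rank_sites(seqs, hosts)
top_site = ranking[0][1]
probs = host_probabilities(seqs, hosts, top_site, "A")
```

On real data the scores would need to be checked against held-out sequences and corrected for phylogenetic relatedness between isolates, which this sketch ignores.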
Field | Value |
---|---|
Media type | E-article |
Year of publication | 2013 |
Contained in | PLoS computational biology - 9(2013), 10, dated the 15th, page e1003254 |
Language | English |
Contributors | Aguas, Ricardo [author]; Ferguson, Neil M [author] |
Topics | Journal Article |
Notes | Date Completed 06.05.2014; Date Revised 29.01.2022; published: Print-Electronic; Citation Status MEDLINE |
doi | 10.1371/journal.pcbi.1003254 |
PPN (catalog ID) | NLM231745028 |
LEADER | 01000naa a22002652 4500 | ||
---|---|---|---|
001 | NLM231745028 | ||
003 | DE-627 | ||
005 | 20231224091444.0 | ||
007 | cr uuu---uuuuu | ||
008 | 231224s2013 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1371/journal.pcbi.1003254 |2 doi | |
028 | 5 | 2 | |a pubmed24n0772.xml |
035 | |a (DE-627)NLM231745028 | ||
035 | |a (NLM)24130470 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Aguas, Ricardo |e verfasserin |4 aut | |
245 | 1 | 0 | |a Feature selection methods for identifying genetic determinants of host species in RNA viruses |
264 | 1 | |c 2013 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Date Completed 06.05.2014 | ||
500 | |a Date Revised 29.01.2022 | ||
500 | |a published: Print-Electronic | ||
500 | |a Citation Status MEDLINE | ||
650 | 4 | |a Journal Article | |
650 | 4 | |a Research Support, N.I.H., Extramural | |
650 | 4 | |a Research Support, Non-U.S. Gov't | |
650 | 7 | |a Receptors, Cell Surface |2 NLM | |
650 | 7 | |a Viral Proteins |2 NLM | |
700 | 1 | |a Ferguson, Neil M |e verfasserin |4 aut | |
773 | 0 | 8 | |i Enthalten in |t PLoS computational biology |d 2005 |g 9(2013), 10 vom: 15., Seite e1003254 |w (DE-627)NLM15722645X |x 1553-7358 |7 nnns |
773 | 1 | 8 | |g volume:9 |g year:2013 |g number:10 |g day:15 |g pages:e1003254 |
856 | 4 | 0 | |u http://dx.doi.org/10.1371/journal.pcbi.1003254 |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a GBV_NLM | ||
951 | |a AR | ||
952 | |d 9 |j 2013 |e 10 |b 15 |h e1003254 |