HSMotifDiscover: identification of motifs in sequences composed of non-single-letter elements
© The Author(s) 2022. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
SUMMARY: The functional sub-strings of a biopolymer sequence, often referred to as motifs, define the specificity of its interactions with other biomolecules. Many computational algorithms and software tools have been developed for finding such motifs in sequences whose individual elements are single characters, such as DNA and protein sequences. However, in more complex scenarios motifs exist in non-single-letter contexts, e.g. preferred patterns of chemical modifications on proteins, DNAs, RNAs or polysaccharides. To search for such motifs, we describe a new method that converts the modified sequence elements to representative single-letter codes and then uses a modified Gibbs-sampling algorithm to define the position-specific scoring matrix representing the motif(s). As a proof of principle, we describe the implementation and application of an R package for discovering heparan sulfate (HS) motifs in glycan sequences, which are important in regulating protein-protein interactions. This software can be valuable for analyzing high-throughput glycoprotein binding data obtained from microarrays of HS oligosaccharides or other biological polymers.
AVAILABILITY AND IMPLEMENTATION: HSMotifDiscover is freely available as an open-source R package released under an MIT license at https://github.com/bioinfoDZ/HSMotifDiscover and as a web app at https://hsmotifdiscover.shinyapps.io/HSMotifDiscover_ShinyApp/.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
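The two steps named in the summary (recoding multi-character elements as single letters, then sampling motif positions to build a position-specific scoring matrix) can be illustrated with a minimal Python sketch. This is not the HSMotifDiscover R package or its API: the function names `encode_tokens`, `build_pssm` and `gibbs_motif` are hypothetical, the saccharide tokens are illustrative only, and the sampler is the plain one-site-per-sequence Gibbs sampler rather than the modified variant the authors describe.

```python
import random
import string


def encode_tokens(token_seqs):
    """Map each distinct multi-character element to one letter,
    in order of first appearance (supports up to 26 distinct elements)."""
    code = {}
    encoded = []
    for seq in token_seqs:
        letters = []
        for tok in seq:
            if tok not in code:
                code[tok] = string.ascii_uppercase[len(code)]
            letters.append(code[tok])
        encoded.append("".join(letters))
    return encoded, code


def build_pssm(seqs, starts, width, alphabet, skip=None, pseudo=1.0):
    """Per-column letter probabilities from the current motif windows.
    Pseudocounts keep every probability strictly positive."""
    counts = [{a: pseudo for a in alphabet} for _ in range(width)]
    for i, (seq, start) in enumerate(zip(seqs, starts)):
        if i == skip:  # leave the sequence being resampled out of the model
            continue
        for j in range(width):
            counts[j][seq[start + j]] += 1
    pssm = []
    for col in counts:
        total = sum(col.values())
        pssm.append({a: c / total for a, c in col.items()})
    return pssm


def gibbs_motif(seqs, width, iters=200, seed=0):
    """Plain Gibbs site sampler: one motif occurrence per sequence.
    Every sequence must be at least `width` letters long."""
    rng = random.Random(seed)
    joined = "".join(seqs)
    alphabet = sorted(set(joined))
    background = {a: joined.count(a) / len(joined) for a in alphabet}
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]
    for _ in range(iters):
        for i, seq in enumerate(seqs):
            pssm = build_pssm(seqs, starts, width, alphabet, skip=i)
            weights = []
            for start in range(len(seq) - width + 1):
                w = 1.0
                for j in range(width):
                    letter = seq[start + j]
                    w *= pssm[j][letter] / background[letter]  # motif vs. background odds
                weights.append(w)
            # draw a new start position in proportion to its odds score
            starts[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return starts, build_pssm(seqs, starts, width, alphabet)


# Illustrative input: each sequence is a list of multi-character elements.
tokens = [
    ["GlcNAc", "GlcA", "GlcNS6S", "IdoA2S", "GlcNS6S"],
    ["GlcNS6S", "IdoA2S", "GlcNS6S", "GlcA", "GlcNAc"],
    ["GlcA", "GlcNS6S", "IdoA2S", "GlcNS6S", "GlcA"],
]
encoded, code = encode_tokens(tokens)
starts, pssm = gibbs_motif(encoded, width=3, iters=50, seed=1)
```

Because the sampler is stochastic, the returned start positions depend on the seed and the number of iterations; in practice several restarts are usually compared. The `code` dictionary can be inverted to translate the PSSM's single letters back to the original elements.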
Field | Value
---|---
Media type | E-article
Year of publication | 2022
Published | 2022
Contained in | To the parent record - volume:38
Contained in | Bioinformatics (Oxford, England) - 38(2022), 16, 10 Aug., pages 4036-4038
Language | English
Contributors | Singh, Vinod Kumar [author]
Subjects | 9007-49-2
Notes | Date Completed 14.11.2022; Date Revised 02.07.2023; published: Print; Citation Status MEDLINE
DOI | 10.1093/bioinformatics/btac437
PPN (catalog ID) | NLM342924699
LEADER | 01000naa a22002652 4500 | ||
---|---|---|---|
001 | NLM342924699 | ||
003 | DE-627 | ||
005 | 20231226015232.0 | ||
007 | cr uuu---uuuuu | ||
008 | 231226s2022 xx |||||o 00| ||eng c | ||
024 | 7 | |a 10.1093/bioinformatics/btac437 |2 doi | |
028 | 5 | 2 | |a pubmed24n1143.xml |
035 | |a (DE-627)NLM342924699 | ||
035 | |a (NLM)35771633 | ||
040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
041 | |a eng | ||
100 | 1 | |a Singh, Vinod Kumar |e verfasserin |4 aut | |
245 | 1 | 0 | |a HSMotifDiscover |b identification of motifs in sequences composed of non-single-letter elements |
264 | 1 | |c 2022 | |
336 | |a Text |b txt |2 rdacontent | ||
337 | |a Computermedien |b c |2 rdamedia | ||
338 | |a Online-Ressource |b cr |2 rdacarrier | ||
500 | |a Date Completed 14.11.2022 | ||
500 | |a Date Revised 02.07.2023 | ||
500 | |a published: Print | ||
500 | |a Citation Status MEDLINE | ||
650 | 4 | |a Journal Article | |
650 | 4 | |a Research Support, N.I.H., Extramural | |
650 | 7 | |a Proteins |2 NLM | |
650 | 7 | |a DNA |2 NLM | |
650 | 7 | |a 9007-49-2 |2 NLM | |
700 | 1 | |a Misra, Rohan |e verfasserin |4 aut | |
700 | 1 | |a Almo, Steven C |e verfasserin |4 aut | |
700 | 1 | |a Steidl, Ulrich G |e verfasserin |4 aut | |
700 | 1 | |a Bülow, Hannes E |e verfasserin |4 aut | |
700 | 1 | |a Zheng, Deyou |e verfasserin |4 aut | |
773 | 0 | 8 | |i Enthalten in |t Bioinformatics (Oxford, England) |d 1998 |g 38(2022), 16 vom: 10. Aug., Seite 4036-4038 |w (DE-627)NLM094620342 |x 1367-4811 |7 nnns |
773 | 1 | 8 | |g volume:38 |g year:2022 |g number:16 |g day:10 |g month:08 |g pages:4036-4038 |
856 | 4 | 0 | |u http://dx.doi.org/10.1093/bioinformatics/btac437 |3 Volltext |
912 | |a GBV_USEFLAG_A | ||
912 | |a GBV_NLM | ||
951 | |a AR | ||
952 | |d 38 |j 2022 |e 16 |b 10 |c 08 |h 4036-4038 |