Publication details - On the design of linked datasets mapping networks of collaboration in the genomic sequencing of Saccharomyces cerevisiae, Homo sapiens, and Sus scrofa

On the design of linked datasets mapping networks of collaboration in the genomic sequencing of Saccharomyces cerevisiae, Homo sapiens, and Sus scrofa

Copyright: © 2023 Wong M and Leng R.

This data note describes a unique two-step methodology to construct six linked datasets covering the sequencing of Saccharomyces cerevisiae, Homo sapiens, and Sus scrofa genomes. The datasets were used as evidence in a project that investigated the history of genomic science. To design the datasets, we first retrieved all sequence submission data from the European Nucleotide Archive (ENA), including accession numbers associated with each of our three species. Second, we used these accession numbers to construct queries to retrieve peer-reviewed scientific publications that first described these sequence submissions in the scientific literature. For each species, this resulted in two associated datasets: 1) a .csv file documenting the PMID of each article describing new sequences, all paper authors, all institutional affiliations of each author, countries of institution, year of first submission to the ENA (when available), and the year of article publication, and 2) a .csv file documenting all institutions submitting to the ENA, the number of nucleotides sequenced, and the years of submission to the database. We utilised these datasets to understand how institutional collaboration shaped sequencing efforts, and to systematically identify important institutions and changes in the structure of research communities throughout the history of genomics and across our three target species. This data note, therefore, should aid researchers who would like to use these data for future analyses by making the methodology that underpins them transparent. Further, by detailing our methodology, we hope to enable researchers to use our approach to construct similar datasets in the future.
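As a rough illustration of how the first .csv file of each species could feed a collaboration analysis, the following is a minimal sketch that counts how many papers each pair of institutions co-authored. The column names (`pmid`, `author`, `institution`) and the sample rows are hypothetical and do not reflect the published schema or data; the pair-counting approach is one common way to build an institution network, not necessarily the authors' own procedure.

```python
import csv
import io
from collections import Counter
from itertools import combinations

# Hypothetical sample rows. The column names are assumptions made for this
# sketch, not the schema of the published datasets.
SAMPLE = """pmid,author,institution
1001,Author A,Institute X
1001,Author B,Institute Y
1001,Author C,Institute X
1002,Author D,Institute Y
1002,Author E,Institute Z
1003,Author F,Institute X
1003,Author G,Institute Y
"""


def institution_edges(rows):
    """Count, for each pair of institutions, the papers they co-authored."""
    # Collect the distinct institutions that appear on each paper.
    by_paper = {}
    for row in rows:
        by_paper.setdefault(row["pmid"], set()).add(row["institution"])

    # Every unordered pair of institutions on one paper adds 1 to that edge.
    edges = Counter()
    for institutions in by_paper.values():
        for pair in combinations(sorted(institutions), 2):
            edges[pair] += 1
    return edges


edges = institution_edges(csv.DictReader(io.StringIO(SAMPLE)))
for (a, b), weight in sorted(edges.items()):
    print(f"{a} -- {b}: {weight}")
```

To run the same count on a real file, the `io.StringIO(SAMPLE)` argument could be replaced by an opened file handle, with the column names adjusted to match the actual headers.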

Media type:	E-article

Year of publication:	2019
Published:	2019

Contained in:	Go to the parent record - volume:8
Contained in:	F1000Research - 8(2019), day 28, page 1200

Language:	English

Contributors:	Wong, Mark [author]; Leng, Rhodri [author]

Links:	Full text

Subjects:	Bibliographic Database; Bibliometrics; Genomics; History of science; Homo sapiens; Journal Article; Network analysis; S. cerevisiae; Sus scrofa

Notes:	Date Revised 03.03.2023; published: Electronic-eCollection; Citation Status PubMed-not-MEDLINE

doi:	10.12688/f1000research.18656.3

PPN (catalogue ID):	NLM321628071

Internal format


LEADER	01000caa a22002652 4500
001	NLM321628071
003	DE-627
005	20231226060526.0
007	cr uuu---uuuuu
008	231225s2019 xx \|\|\|\|\|o 00\| \|\|eng c
024	7		\|a 10.12688/f1000research.18656.3 \|2 doi
028	5	2	\|a pubmed24n1178.xml
035			\|a (DE-627)NLM321628071
035			\|a (NLM)33604022
040			\|a DE-627 \|b ger \|c DE-627 \|e rakwb
041			\|a eng
100	1		\|a Wong, Mark \|e verfasserin \|4 aut
245	1	0	\|a On the design of linked datasets mapping networks of collaboration in the genomic sequencing of Saccharomyces cerevisiae, Homo sapiens, and Sus scrofa
264		1	\|c 2019
336			\|a Text \|b txt \|2 rdacontent
337			\|a Computermedien \|b c \|2 rdamedia
338			\|a Online-Ressource \|b cr \|2 rdacarrier
500			\|a Date Revised 03.03.2023
500			\|a published: Electronic-eCollection
500			\|a Citation Status PubMed-not-MEDLINE
520			\|a Copyright: © 2023 Wong M and Leng R.
520			\|a This data note describes a unique two-step methodology to construct six linked datasets covering the sequencing of Saccharomyces cerevisiae, Homo sapiens, and Sus scrofa genomes. The datasets were used as evidence in a project that investigated the history of genomic science. To design the datasets, we first retrieved all sequence submission data from the European Nucleotide Archive (ENA), including accession numbers associated with each of our three species. Second, we used these accession numbers to construct queries to retrieve peer-reviewed scientific publications that first described these sequence submissions in the scientific literature. For each species, this resulted in two associated datasets: 1) A .csv file documenting the PMID of each article describing new sequences, all paper authors, all institutional affiliations of each author, countries of institution, year of first submission to the ENA (when available), and the year of article publication, and 2) A .csv file documenting all institutions submitting to the ENA, number of nucleotides sequenced and years of submission to the database. We utilised these datasets to understand how institutional collaboration shaped sequencing efforts, and to systematically identify important institutions and changes in the structure of research communities throughout the history of genomics and across our three target species. This data note, therefore, should aid researchers who would like to use these data for future analyses by making the methodology that underpins it transparent. Further, by detailing our methodology, researchers may be able to utilise our approach to construct similar datasets in the future
650		4	\|a Journal Article
650		4	\|a Bibliographic Database
650		4	\|a Bibliometrics
650		4	\|a Homo sapiens
650		4	\|a S. cerevisiae
650		4	\|a Sus scrofa
650		4	\|a genomics
650		4	\|a history of science
650		4	\|a network analysis
700	1		\|a Leng, Rhodri \|e verfasserin \|4 aut
773	0	8	\|i Enthalten in \|t F1000Research \|d 2012 \|g 8(2019) vom: 28., Seite 1200 \|w (DE-627)NLM22994549X \|x 2046-1402 \|7 nnns
773	1	8	\|g volume:8 \|g year:2019 \|g day:28 \|g pages:1200
856	4	0	\|u http://dx.doi.org/10.12688/f1000research.18656.3 \|3 Volltext
912			\|a GBV_USEFLAG_A
912			\|a GBV_NLM
951			\|a AR
952			\|d 8 \|j 2019 \|b 28 \|h 1200
