
dc.date.accessioned: 2024-04-10T15:17:20Z
dc.date.available: 2024-04-10T15:17:20Z
dc.date.created: 2023-08-08T16:03:27Z
dc.date.issued: 2023
dc.identifier.citation: Fleming, James Frederick; Eriksen, Pia Merete; Struck, Torsten H. Scoutknife: A naïve, whole genome informed phylogenetic robusticity metric. F1000 Research. 2023, 12(945)
dc.identifier.uri: http://hdl.handle.net/10852/110538
dc.description.abstract: Background: The phylogenetic bootstrap, first proposed by Felsenstein in 1985, is a critically important statistical method for assessing the robusticity of phylogenetic datasets. Core to its concept is the use of pseudo-sampling: assessing the data by generating new replicates derived from the initial dataset that was used to generate the phylogeny. In this way, phylogenetic support metrics can overcome the lack of perfect, infinite data. With infinite data, however, it would be possible to sample smaller replicates directly from the data to obtain both the phylogeny and its statistical robusticity in the same analysis. Due to the growth of whole genome sequencing, the depth and breadth of our datasets have greatly expanded and are set to expand further. With genome-scale datasets comprising thousands of genes, we can now obtain a proxy for infinite data. Accordingly, we can potentially abandon the notion of pseudo-sampling and instead randomly sample small subsets of genes from the thousands of genes in our analyses. Methods: We introduce Scoutknife, a jackknife-style subsampling implementation that generates 100 datasets by randomly sampling a small number of genes from an initial large-gene dataset to jointly establish a phylogenetic hypothesis and assess its robusticity. We assess its effectiveness using 18 previously published datasets and 100 simulation studies. Results: We show that Scoutknife is conservative and informative as to conflicts and incongruence across the whole genome, without the need for subsampling based on traditional model selection criteria. Conclusions: Scoutknife reliably achieves results comparable to selecting the best genes on both real and simulated datasets, while being resistant to the potential biases caused by selecting for model fit. As the amount of genome data grows, it becomes an even more exciting option for assessing the robusticity of phylogenetic hypotheses.
dc.language: EN
dc.publisher: F1000Research
dc.rights: Attribution 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.title: Scoutknife: A naïve, whole genome informed phylogenetic robusticity metric
dc.title.alternative (EN): Scoutknife: A naïve, whole genome informed phylogenetic robusticity metric
dc.type: Journal article
dc.creator.author: Fleming, James Frederick
dc.creator.author: Eriksen, Pia Merete
dc.creator.author: Struck, Torsten H
cristin.unitcode: 185,28,8,8
cristin.unitname: Forskningsgruppe i evolusjonær zoologi
cristin.ispublished: true
cristin.fulltext: preprint
cristin.qualitycode: 1
dc.identifier.cristin: 2165714
dc.identifier.bibliographiccitation: info:ofi/fmt:kev:mtx:ctx&ctx_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.jtitle=F1000 Research&rft.volume=12&rft.spage=&rft.date=2023
dc.identifier.jtitle: F1000 Research
dc.identifier.volume: 12
dc.identifier.issue: 945
dc.identifier.doi: https://doi.org/10.12688/f1000research.139356.1
dc.type.document: Tidsskriftartikkel
dc.type.peerreviewed: Peer reviewed
dc.source.issn: 2046-1402
dc.type.version: PublishedVersion
cristin.articleid: 945
dc.relation.project: SIGMA2/NS9408K
dc.relation.project: SIGMA2/NN9408K
dc.relation.project: NFR/300587
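
The subsampling step described in the abstract (100 replicate datasets, each a small random draw of genes from a large gene set) can be illustrated with a minimal sketch. This is not the Scoutknife implementation: the function names, the `genes_per_replicate` parameter, sampling without replacement within a replicate, and the clade-frequency summary are all assumptions made for illustration, and tree inference on each replicate is left out entirely.

```python
import random


def subsample_genes(gene_ids, genes_per_replicate, n_replicates=100, seed=None):
    """Draw n_replicates random gene subsets (hypothetical helper).

    Each subset is sampled without replacement from gene_ids; different
    replicates are drawn independently, so they may share genes.
    """
    genes = list(gene_ids)
    if not 0 < genes_per_replicate <= len(genes):
        raise ValueError("genes_per_replicate must be between 1 and the number of genes")
    rng = random.Random(seed)  # seeded generator so a run can be reproduced
    return [sorted(rng.sample(genes, genes_per_replicate)) for _ in range(n_replicates)]


def clade_support(replicate_clades, clade):
    """Fraction of replicate trees that contain the given clade.

    replicate_clades holds one set of clades per replicate tree, each clade
    being a frozenset of taxon names. This frequency summary is an assumed,
    generic way to report support, not necessarily the authors' metric.
    """
    target = frozenset(clade)
    hits = sum(1 for clades in replicate_clades if target in clades)
    return hits / len(replicate_clades)
```

In use, each list returned by `subsample_genes` would be turned into an alignment and passed to a tree inference program of choice; the clades recovered from those trees would then be passed to `clade_support`.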


This item's license is: Attribution 4.0 International