Lost Branches

Just read Lost Branches on the Tree of Life in PLoS Biology and couldn’t agree more! I too am frequently frustrated by data that were never deposited and by email requests for data that go unanswered. The 16% reply rate seems consistent with other such studies. I agree with the general message that what is most needed is a cultural shift towards expecting data to be deposited and shared. I loved this paper overall but nonetheless cannot resist a few nitpicks.

The article does address some data quality concerns, but makes no mention of the challenges posed by many of the current formats for actually sharing the data. “Nexus” files are used differently by different programs (e.g. MrBayes makes certain modifications to its nexus files which may render them incompatible with other programs), and they often contain formatting errors that make them unreadable (for example, about 15% of NEXUS files submitted to the CIPRES portal contain unrecoverable yet hard-to-diagnose file format errors). I understand that the NeXML format (see Vos et al: 10.1093/sysbio/sys025), with its formal grammar, provides a way to address both these validation and extensibility problems, so it might have been nice to see it referenced.
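To make the validation point a bit more concrete, here is a minimal sketch of the kind of structural check a NEXUS file has to pass before any program can read it. This is my own illustration, not anything from the paper, CIPRES, or NeXML: the function name check_nexus is hypothetical, and the two rules it applies (a #NEXUS header line, and one END for every BEGIN) are only a small subset of what a real parser enforces. It also assumes comments are not nested and that BEGIN and END each start their own line.

```python
import re


def check_nexus(text):
    """Return a list of structural problems found in a NEXUS-style string.

    Hypothetical helper: it checks only the header and block balance.
    """
    problems = []

    # A NEXUS file is expected to open with a "#NEXUS" line.
    lines = text.strip().splitlines()
    if not lines or lines[0].strip().upper() != "#NEXUS":
        problems.append("missing #NEXUS header")

    # Drop [bracketed comments] (non-nested only) before counting blocks.
    body = re.sub(r"\[[^\]]*\]", "", text)

    # Count block openers ("BEGIN name;") and closers ("END;" / "ENDBLOCK;").
    begins = re.findall(r"^\s*begin\s+(\w+)\s*;", body, flags=re.I | re.M)
    ends = re.findall(r"^\s*end(?:block)?\s*;", body, flags=re.I | re.M)
    if len(begins) != len(ends):
        problems.append(f"{len(begins)} BEGIN vs {len(ends)} END statements")

    return problems


good = "#NEXUS\nBEGIN TAXA;\n  DIMENSIONS NTAX=2;\n  TAXLABELS A B;\nEND;\n"
bad = "BEGIN TREES;\n  TREE t1 = (A,B);\n"

print(check_nexus(good))
print(check_nexus(bad))
```

Even a check this crude only catches the most obvious breakage. Program-specific dialects, like the MrBayes modifications mentioned above, would sail straight through it, which is why a format with a formal grammar that can be validated against a schema is so appealing.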

I agree with the authors’ recommendation that the Treebase submission process be more streamlined (e.g. like Dryad’s), though they didn’t make much mention of the fact that Treebase in turn supplies much richer metadata than Dryad. Clearly this is a trade-off, and it is no doubt good that alternatives exist. Another minor angle that I didn’t see addressed much in the paper is the role of publishers in checking that the identifiers, when present, actually resolve to the data, an issue highlighted by this recent paper: https://peerj.com/articles/148/.