Nurturing reliable and robust open-source scientific software

Leonardo Uieda, Paul Wessel (2017)

About

I was invited to this panel session on "Open-Source Software in the Geosciences" along with some very famous people: Kerry Key, Brian Savage, Gary D Egbert, Colin Andrew Zelt, and Lion Krischer. Many thanks to the chairs Anna Kelbert and Louise Pellerin for putting this together and to Lindsey Heagy for the invitation.

Abstract

Scientific results are increasingly the product of software. The reproducibility and validity of published results cannot be ensured without access to the source code of the software used to produce them. Therefore, the code itself is a fundamental part of the methodology and must be published along with the results. With such a reliance on software, it is troubling that most scientists do not receive formal training in software development. Tools such as version control, continuous integration, and automated testing are routinely used in industry to ensure the correctness and robustness of software. However, many scientist do not even know of their existence (although efforts like Software Carpentry are having an impact on this issue). Publishing the source code is only the first step in creating an open-source project. For a project to grow it must provide documentation, participation guidelines, and a welcoming environment for new contributors. Expanding the project community is often more challenging than the technical aspects of software development. Maintainers must invest time to enforce the rules of the project and to onboard new members, which can be difficult to justify in the context of the “publish or perish” mentality. This problem will continue as long as software contributions are not recognized as valid scholarship by hiring and tenure committees. Furthermore, there are still unsolved problems in providing attribution for software contributions. Many journals and metrics of academic productivity do not recognize citations to sources other than traditional publications. Thus, some authors choose to publish an article about the software and use it as a citation marker. One issue with this approach is that updating the reference to include new contributors involves writing and publishing a new article. A better approach would be to cite a permanent archive of individual versions of the source code in services such as Zenodo. However, citations to these sources are not always recognized when computing citation metrics. In summary, the widespread development of reliable and robust open-source software relies on the creation of formal training programs in software development best practices and the recognition of software as a valid form of scholarship.


The thumbnail image is based on the Open Source Initiative logo (licensed CC-BY).

Info