To ensure reliable maintenance of thousands of packages, we use a semi-automatic, agent-assisted development workflow (Supplementary Fig.
\ref{491724}d), orchestrated by a suite of tools we authored and maintain (bioconda-utils, https://github.com/bioconda/bioconda-utils). All Bioconda recipes are hosted in a GitHub repository (https://github.com/bioconda/bioconda-recipes). Both the addition of new recipes and the update of existing recipes in Bioconda are handled via
pull requests. A contributor opens a pull request on the GitHub repository with a modified version of one or more recipes, and these changes are automatically compared against the current state of Bioconda. Once a pull request arrives, our infrastructure performs several automatic checks. Problems discovered in any step are reported to the contributor, and further progress is blocked until they are resolved. First, the modified recipes are checked for syntactic anti-patterns, i.e., formulations that are syntactically correct but considered bad style (a step termed linting). This process ensures consistency across all submitted recipes, serving as an initial quality-control step and easing the automated maintenance of recipes. Second, the modified recipes are built on Linux and macOS via a cloud-based, free-of-charge service (https://travis-ci.org). Successfully built recipes are tested (e.g., by running the generated executable). Since Bioconda packages must be able to run on any supported system, it is important to check that the built packages do not rely on particular elements of the build environment. Therefore, testing happens in two stages: (a) test cases are executed in the full build environment and (b) test cases are executed in a minimal Docker (https://docker.com) container that purposefully lacks all non-common system libraries. Hence, a dependency that is not explicitly defined will lead to a failure in the latter, more stringent test. Once the
build and test steps have succeeded, a member of the Bioconda team reviews the proposed changes and, if acceptable, merges the modifications into the official repository. Upon merging, packages are uploaded to the hosted Bioconda channel (https://anaconda.org/bioconda), where they become available via the Conda package manager. When a Bioconda package is updated to a new version, older builds are generally preserved, and recipes for multiple older versions may be maintained in the Bioconda repository.
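The gating logic of the automatic checks described above can be illustrated with a minimal sketch. It is not the bioconda-utils implementation: the `Recipe` fields, the single lint rule, and the function names are all hypothetical, and the two test stages are reduced to a set comparison. The sketch only shows why an undeclared dependency passes the full build environment but fails in the minimal one, and how the first failure blocks further progress.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set


@dataclass
class Recipe:
    """Toy stand-in for a recipe; all field names are assumptions of this sketch."""
    name: str
    test_commands: List[str] = field(default_factory=list)
    declared_deps: Set[str] = field(default_factory=set)
    libs_used_at_runtime: Set[str] = field(default_factory=set)


# A check returns an error message on failure and None on success.
Check = Callable[[Recipe], Optional[str]]


def lint(recipe: Recipe) -> Optional[str]:
    # Hypothetical lint rule: a recipe without tests is valid but bad style.
    return None if recipe.test_commands else "lint: no test commands defined"


def check_full_env(recipe: Recipe) -> Optional[str]:
    # Stage (a): the build environment provides every library, so an
    # undeclared dependency goes unnoticed here.
    return None


def check_minimal_env(recipe: Recipe) -> Optional[str]:
    # Stage (b): only explicitly declared dependencies are available.
    missing = recipe.libs_used_at_runtime - recipe.declared_deps
    if missing:
        return f"test (minimal): undeclared dependencies {sorted(missing)}"
    return None


CHECKS: List[Check] = [lint, check_full_env, check_minimal_env]


def run_checks(recipe: Recipe) -> Optional[str]:
    """Run the checks in order and return the first failure, if any."""
    for check in CHECKS:
        error = check(recipe)
        if error is not None:
            return error  # reported to the contributor; later steps are skipped
    return None  # all automatic checks passed; ready for human review


ok = Recipe("toolA", ["toolA --help"], {"zlib"}, {"zlib"})
bad = Recipe("toolB", ["toolB --help"], {"zlib"}, {"zlib", "libcurl"})
print(run_checks(ok))   # None
print(run_checks(bad))  # test (minimal): undeclared dependencies ['libcurl']
```

In this toy model, `toolB` would pass stage (a) because the library happens to be present in the build environment, and only the stricter stage (b) exposes the missing declaration.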
The above process appears to scale well with the growing number of recipes and contributors (Supplementary Fig. \ref{491724}a,b). The usual turnaround time of the workflow is short (Supplementary Fig. \ref{491724}e): 61% of the pull requests are merged within 5 hours. Of those, 36% are even merged within 1 hour. Only 18% of the pull requests need more than a day. Hence, publishing software in Bioconda or updating existing packages can typically be accomplished within minutes to a few hours.
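As an illustration of how such turnaround figures can be derived, the following sketch computes the fraction of pull requests merged within a time limit, and the share of those that were merged within a shorter limit. The durations are made-up values for demonstration only, not Bioconda data, and reading "of those" as a conditional share is an assumption of this sketch.

```python
from typing import Sequence


def fraction_within(hours_to_merge: Sequence[float], limit_hours: float) -> float:
    """Fraction of pull requests merged within limit_hours of being opened."""
    if not hours_to_merge:
        raise ValueError("no pull requests given")
    merged_in_time = sum(1 for h in hours_to_merge if h <= limit_hours)
    return merged_in_time / len(hours_to_merge)


def share_of_subset(hours_to_merge: Sequence[float], inner: float, outer: float) -> float:
    """Among pull requests merged within `outer` hours, the share merged within `inner` hours."""
    subset = [h for h in hours_to_merge if h <= outer]
    return fraction_within(subset, inner)


# Made-up durations in hours from opening to merge (illustration only).
example = [0.5, 0.8, 2.0, 4.5, 30.0]

within_5h = fraction_within(example, 5)           # 4 of 5 pull requests
within_1h_of_5h = share_of_subset(example, 1, 5)  # 2 of those 4
over_a_day = 1 - fraction_within(example, 24)     # 1 of 5 pull requests
```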
Using Bioconda as a service to obtain packages for local installation entails trusting that (a) the provided software itself is not harmful and (b) it has not been modified in a harmful way. Ensuring (a) is up to the user. In contrast, (b) is handled by our workflow. First, source code or binary files defined in recipes are checked for integrity via MD5 or SHA256 hash values. Second, all review and testing steps are enforced via the GitHub interface. This guarantees that all packages have been tested automatically and reviewed by a human being. Third, all changes to the repository of recipes are publicly tracked, and all build and test steps are transparently visible to the user. Finally, the automatic parts of the development workflow are implemented in the open-source software bioconda-utils. In the future, we will further explore the possibility of signing packages cryptographically.
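The hash-based integrity check mentioned above can be sketched as follows. This is a generic illustration using Python's standard `hashlib` module, not the code used by Conda or bioconda-utils; the function names and the chunk size are assumptions. The idea is that a recipe records the expected digest of a source archive, and the downloaded file is accepted only if its computed digest matches.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_source(path: str, expected_sha256: str) -> bool:
    """Accept the file only if its digest equals the one recorded in the recipe."""
    return sha256_of_file(path) == expected_sha256.strip().lower()
```

A file that was altered after the recipe author recorded its hash yields a different digest, so `verify_source` returns `False` and the build can be aborted before any of the modified code is used.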