An unprecedented opportunity to advance understanding of the biological rules that govern the diversity and dynamics of life now exists, thanks to the large quantity and variety of data that are becoming increasingly available. This opportunity comes at a critical moment, when human systems are disrupting those very dynamics. However, the scientific and computational tools needed to derive understanding from these data are still missing. Such tools need to be accessible to a broad community of users, thereby catalyzing involvement and innovation. This project will (1) build a computational model for multiple aspects of biodiversity (species abundance and genetic, functional, and phylogenetic diversity); (2) use and refine this model by testing major hypotheses about the generation and maintenance of biodiversity in three exemplar systems; (3) make the model accessible to the scientific community by building an open-source platform to prepare diverse data sources and run the model; and (4) create pedagogically effective courses and workshops to enable students, researchers, and stakeholders from many backgrounds to understand biodiversity theory and the data science tools needed to test that theory with data.

The Rules of Life Engine (RoLE) model will be a mechanistic, simulation-based framework for hypothesis testing and data synthesis, enabling scientists with multi-dimensional biodiversity data to generate and test parameterized hypotheses about the processes driving biodiversity patterns. The RoLE model will apply new techniques in machine learning to fit models to high-dimensional, cross-scale data. The model will simulate eco-evolutionary community assembly, building on individual-based ecological and genetic neutral models with added non-neutral, trait-based competition and environmental filtering. New species and traits will arise through long-timescale evolution in the metacommunity and rapid evolution in the local community. Population genetics and species abundances in the local community will be modeled through birth, death, immigration, and mutation. The project research team will refine and illustrate the use of the RoLE model by testing four hypothesized rules of life across three biogeographic systems for which multi-scale biodiversity data are now available. The hypotheses address the relative roles of immigration versus speciation in community assembly, how species interactions influence diversity, how different assembly histories determine the strength of species interactions, and whether and how systems come to equilibrium. The project leaders have established a network of 14 collaborators, including the National Ecological Observatory Network, who will use the RoLE model in their diverse systems and propagate wider adoption. To further reduce barriers to use, the RoLE model framework will be made available as open-source software, including an R-language Shiny app interface that creates containerized workflows. Output research objects with standardized metadata will promote reproducibility and sharing.
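
As a rough illustration of the birth, death, immigration, and speciation dynamics described above, the following Python sketch simulates a purely neutral, individual-based local community. It is not the RoLE implementation: the function name, parameter names, default values, and the default metacommunity are hypothetical choices made for this example, and the trait-based competition, environmental filtering, and population-genetic components are omitted.

```python
import random
from collections import Counter


def simulate_local_community(
    n_individuals=200,      # fixed local community size (assumed value)
    n_steps=5000,           # number of death/replacement events
    immigration_prob=0.05,  # chance a vacancy is filled by an immigrant
    speciation_prob=0.001,  # chance a vacancy is filled by a new species
    metacommunity=None,     # mapping: species id -> relative abundance
    seed=0,
):
    """Zero-sum neutral drift with immigration and point speciation.

    Returns a Counter mapping species id to local abundance.
    """
    rng = random.Random(seed)
    if metacommunity is None:
        # Hypothetical metacommunity: 20 species with uneven abundances.
        metacommunity = {f"sp{i}": 1.0 / (i + 1) for i in range(20)}
    meta_species = list(metacommunity)
    meta_weights = list(metacommunity.values())

    # Initialise the local community by sampling from the metacommunity.
    community = rng.choices(meta_species, weights=meta_weights, k=n_individuals)

    n_new_species = 0
    for _ in range(n_steps):
        # Death: one randomly chosen individual is removed.
        dead = rng.randrange(n_individuals)
        u = rng.random()
        if u < speciation_prob:
            # Speciation: the vacancy is filled by a brand-new species.
            n_new_species += 1
            community[dead] = f"new{n_new_species}"
        elif u < speciation_prob + immigration_prob:
            # Immigration: draw a species in proportion to its
            # metacommunity abundance.
            community[dead] = rng.choices(meta_species, weights=meta_weights, k=1)[0]
        else:
            # Birth: offspring of a randomly chosen local individual
            # (the parent is drawn from the pre-death community).
            community[dead] = community[rng.randrange(n_individuals)]
    return Counter(community)


if __name__ == "__main__":
    abundances = simulate_local_community()
    print(abundances.most_common(5))
```

In a sketch like this, raising immigration_prob relative to speciation_prob pulls the local community toward the metacommunity composition, while the reverse lets locally arising species accumulate; this is the kind of contrast the first hypothesis concerns.
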
The insights gained from the RoLE model are directly relevant to conservation: for example, whether communities are assembled primarily by in situ evolution or by ex situ immigration strongly determines their response to anthropogenic pressures and the optimal approach to conservation management. To encourage participation in quantitative biodiversity research, the project leaders will develop a massive open online course through the Santa Fe Institute’s Complexity Explorer program and use the RoLE model as an interactive teaching tool. In conjunction with Data Carpentry and Software Carpentry, the research team will also provide an in-person data science training workshop. Results from this project can be found at https://role-model.github.io.