While reading texts on statistics and meta-science I kept noticing vagueness. For example, there seem to be half a dozen definitions of replicability in papers since 2016. In this text, I try to formalize the underlying structure.
Edit 2020-11-01: The model below is basically the same as, but poorer than, the causal models presented by, for example, Pearl (2009).
Assume determinism. Assume that for any function $f$ there is a set of predicates, or context, that must hold for the function to hold, that is, to return the correct answer. Let this be denoted by $f_C$, where $C$ is the context. For example, writing Bernoulli's equation solved for one of its variables as a function $b$, it only holds for a context containing isentropic flows, that is, $b_C$, where $C$ contains isentropic flows. There have been arguments that such contexts need to contain an (open-ended) list of negative conditions (Hoefer, 2003). Let these contexts and the contexts below also contain this list.
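To make the idea of a function with a context concrete, here is a minimal Python sketch. It is only an illustration under assumptions: the class name `ContextualFunction`, the predicate name `"isentropic_flow"`, and the stand-in formula in `toy_law` are all hypothetical and not part of the model above. Predicates are represented as plain strings, and the function refuses to return an answer when a required predicate is absent from the context it is called in.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class ContextualFunction:
    """A function together with the predicates that must hold for it to be valid."""

    func: Callable[..., float]
    required: FrozenSet[str]  # hypothetical predicate names

    def __call__(self, context: FrozenSet[str], *args: float) -> float:
        # The function only "holds" if every required predicate is in the context.
        missing = self.required - context
        if missing:
            raise ValueError(f"context lacks required predicates: {sorted(missing)}")
        return self.func(*args)


def toy_law(x: float) -> float:
    # Stand-in for a physical law; the formula itself is illustrative only.
    return 2.0 * x


law = ContextualFunction(toy_law, frozenset({"isentropic_flow"}))

# Valid: the context contains the required predicate (extra predicates are fine).
print(law(frozenset({"isentropic_flow", "steady_flow"}), 3.0))
```

Calling `law` with a context that lacks `"isentropic_flow"` raises a `ValueError` instead of silently returning a number that may be wrong.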
The goal of science is to come up with models which allow for making accurate predictions. The scientific process consists of various steps to derive these models. Let $\mathcal{C}$ be the context space, that is, all the possible contexts a scientist can choose to experiment in. For some study $s$, let
- $C_s \in \mathcal{C}$ be the context (a set of predicates) which holds for study $s$,
- $S_s$ be the sample used by study $s$,
- $R_s$ be the raw data obtained from the sample,
- $W_s$ be the wrangled data, that is, the cleaned data,
- $T_s$ be statistics on $W_s$, such as the mean, and
- $M_s$ be the reported model.
Note that the context stated for the reported model typically contains fewer predicates than the context that actually held for the study; this is usually a generalization based on the researcher's intuition. For example, a researcher omits the fact that a study on 20 patients only included patients with blue eyes, since that is not expected to affect the results. The steps are chosen such that each step can be a potential source of error.
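We can depict a study and its six steps as
$$
\text{context space} \xrightarrow{1} \text{context} \xrightarrow{2} \text{sample} \xrightarrow{3} \text{raw data} \xrightarrow{4} \text{wrangled data} \xrightarrow{5} \text{statistics} \xrightarrow{6} \text{reported model}
$$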
Most steps have well-known names. Step 2 is called sampling, step 3 measuring, step 4 data cleaning or wrangling, step 5 calculating statistics, and step 6 inference.
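The six steps can be sketched as a chain of small Python functions, one per step. This is a toy illustration, not a definitive implementation: all function names, the two-element `CONTEXT_SPACE`, the simulated measurements, and the choice to drop `"blue_eyes"` from the reported context are assumptions made for the example. The point is only that each step is a separate function, and therefore a separate place where an error could enter.

```python
import random
import statistics
from typing import Any, Dict, FrozenSet, List, Optional

Context = FrozenSet[str]

# Hypothetical context space: every context the scientist could experiment in.
CONTEXT_SPACE: List[Context] = [
    frozenset({"adult_patients", "blue_eyes"}),
    frozenset({"adult_patients", "brown_eyes"}),
]


def choose_context(space: List[Context], index: int) -> Context:
    """Step 1: pick one context from the context space."""
    return space[index]


def draw_sample(context: Context, n: int) -> List[int]:
    """Step 2 (sampling): subject ids stand in for patients recruited in `context`."""
    return list(range(n))


def measure(sample: List[int], rng: random.Random) -> List[Optional[float]]:
    """Step 3 (measuring): simulated readings; the first measurement fails."""
    raw: List[Optional[float]] = [rng.gauss(10.0, 2.0) for _ in sample]
    raw[0] = None
    return raw


def wrangle(raw: List[Optional[float]]) -> List[float]:
    """Step 4 (data cleaning): drop failed measurements."""
    return [x for x in raw if x is not None]


def summarize(clean: List[float]) -> Dict[str, float]:
    """Step 5 (calculating statistics)."""
    return {"n": len(clean), "mean": statistics.mean(clean)}


def infer(stats: Dict[str, float], context: Context, omitted: Context) -> Dict[str, Any]:
    """Step 6 (inference): report a model for a generalized (smaller) context."""
    return {"mean": stats["mean"], "context": context - omitted}


def run_study(seed: int = 0) -> Dict[str, Any]:
    rng = random.Random(seed)  # fixed seed, in line with the determinism assumption
    context = choose_context(CONTEXT_SPACE, 0)
    sample = draw_sample(context, 20)
    raw = measure(sample, rng)
    clean = wrangle(raw)
    stats = summarize(clean)
    model = infer(stats, context, omitted=frozenset({"blue_eyes"}))
    return {"context": context, "statistics": stats, "model": model}


study = run_study()
# The reported context is a strict subset of the context that actually held.
print(study["model"]["context"] < study["context"])  # True
```

Here the researcher's generalization shows up as the `omitted` argument in step 6: the study ran on blue-eyed patients only, but the reported model no longer says so.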
Goodman et al. (2016) introduce the following definitions for reproducibility.
- methods reproducibility: "the ability to implement, as exactly as possible, the experimental and computational procedures, with the same data and tools, to obtain the same results",
- results reproducibility: "the production of corroborating results in a new study, having followed the same experimental methods", and
- inferential reproducibility: "the making of knowledge claims of similar strength from a study replication or reanalysis".
These definitions can also be stated as
- methods reproducibility: obtaining the same results when steps 4 and 5 are identical,
- results reproducibility: obtaining the same results when steps 1, 2, 3, 4 and 5 are identical, and
- inferential reproducibility: obtaining the same results when steps 1, 2, 3, 4, 5 and 6 are identical.
Goodman, S. N., Fanelli, D., & Ioannidis, J. P. A. (2016). What does research reproducibility mean? Science Translational Medicine, 8(341), 341ps12. https://doi.org/10.1126/scitranslmed.aaf5027
Hoefer, C. (2003). Causal Determinism. The Stanford Encyclopedia of Philosophy. https://plato.stanford.edu/archives/spr2016/entries/determinism-causal/
Pearl, J. (2009). Causality. Cambridge University Press.