RRT: Services: Biostatistics Consulting Center: Indiana University

RRT Review Checklists

Second Statistician

If you would like BCC to do a "second statistician" RRT review of the analyses for your paper, please send us:

the draft manuscript (or presentation/report) with tables, figures, and text to review,
the analysis data with all final variables necessary and sufficient for analysis in the paper,
code (R, SAS, Stata, or SPSS) that produces all results included in the paper,
the statistical output that shows all results included in the paper.

The second statistician will perform the following 3-step RRT verification process:

Verify that the code Reproduces exactly the numeric results presented in the tables and text of the manuscript or report.
Verify that the methods used in the code are Transparently described in the manuscript or report.
Verify that the methods used are correct and appropriate (Rigor) or advise on any alternatives.

The BCC statistician will provide a report listing any discrepancies in Reproducibility or Transparency as well as feedback or advice on methods (Rigor).
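
As an illustration of the Reproduces step, the comparison between the manuscript and the code output can itself be scripted. The sketch below uses Python purely for illustration; the same idea carries over to R, SAS, Stata, or SPSS. The function names, result labels, and numbers are all hypothetical, and the sketch assumes the manuscript rounds half up, which may differ from your software's default rounding.

```python
from decimal import Decimal, ROUND_HALF_UP


def matches_reported(reproduced, reported):
    """Return True if `reproduced` (a float from the code output), rounded to
    the number of decimals shown in `reported` (a string copied from the
    manuscript), equals the reported value."""
    places = Decimal(reported).as_tuple().exponent   # "0.43" -> -2
    quantum = Decimal(1).scaleb(places)              # -2 -> Decimal("0.01")
    # Assumption: half-up rounding; adjust if your software rounds differently.
    rounded = Decimal(repr(reproduced)).quantize(quantum, rounding=ROUND_HALF_UP)
    return rounded == Decimal(reported)


def find_discrepancies(reported, reproduced):
    """List every manuscript value that the code output fails to reproduce."""
    problems = []
    for label, shown in reported.items():
        if label not in reproduced:
            problems.append(f"{label}: not produced by the code")
        elif not matches_reported(reproduced[label], shown):
            problems.append(
                f"{label}: manuscript shows {shown}, code gives {reproduced[label]}"
            )
    return problems


# Hypothetical values typed from a manuscript (kept as strings so that the
# number of displayed decimals is preserved).
reported = {
    "Table 1: mean BMI": "27.3",
    "Table 2: age OR": "1.04",
    "Table 2: age p": "0.032",
}
# Hypothetical values extracted from the statistical output.
reproduced = {
    "Table 1: mean BMI": 27.3149,
    "Table 2: age OR": 1.0468,
}

for line in find_discrepancies(reported, reproduced):
    print(line)
```

Keeping the manuscript values as strings means a value printed as 1.50 is checked to two decimals rather than one.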

First Statistician

If you are the primary analyst preparing files to submit for RRT review, please do the following:

GOOD

Have the data in a format immediately ready to import into the software.
- If there are multiple data files, include a file directory describing what's what.
Save the syntax/code/script in its native software format (e.g. .sas for SAS; .sps for SPSS; .R or .txt for R; .do for Stata) so it can be easily re-run.
- Don't copy/paste into Word.
Make sure that the script runs top to bottom with one click and no manual intervention.
- Include any filters/subsets or calculations in the script.
- If you ran a model twice with different parameters or variables and both versions are presented in the paper, list them separately in the code.
Check that the script is sufficient to produce ALL numeric results (text and tables) in the paper/report.
- Don't leave out descriptive statistics (Table 1) because they seem less important.
- Make a note of any pieces that had to be run separately in different software, and include those pieces separately.
Do a clean run (top to bottom) of your code and save the output showing every result used in the paper/report.
- This can be copy/pasted into Word if that is easiest.
Send (or share) all of the data, the data codebook, syntax code, output, and the draft of the paper with all text/tables/figures in one email (or one online folder).
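
The GOOD items about filters and separately listed models can be pictured with a small script skeleton. This is only a sketch in Python (the same layout applies in R, SAS, Stata, or SPSS): the embedded data, the variable names, the age filter, and the simple least-squares slope standing in for a real model are all hypothetical.

```python
import csv
import io
import statistics

# Hypothetical analysis data, embedded so the sketch is self-contained.
# In practice this would be the data file sent with the submission.
RAW_CSV = """id,age,group,sbp
1,16,A,110
2,30,A,120
3,40,A,130
4,50,B,150
5,60,B,170
"""


def load_rows(text):
    """Parse the CSV text and convert the numeric columns."""
    rows = []
    for r in csv.DictReader(io.StringIO(text)):
        rows.append({"id": r["id"], "age": int(r["age"]),
                     "group": r["group"], "sbp": float(r["sbp"])})
    return rows


def ols_slope(x, y):
    """Least-squares slope of y on x (a stand-in for a real model)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def run_all():
    """Run every step, top to bottom, with no manual intervention."""
    rows = load_rows(RAW_CSV)

    # The subset is applied in code, not by hand in a spreadsheet.
    adults = [r for r in rows if r["age"] >= 18]

    results = {}

    # Table 1: descriptive statistics are part of the script too.
    results["Table 1: mean SBP"] = statistics.fmean([r["sbp"] for r in adults])

    # Table 2, Model 1: all adults.
    results["Table 2, Model 1: slope"] = ols_slope(
        [r["age"] for r in adults], [r["sbp"] for r in adults])

    # Table 2, Model 2: same model on group A only, listed separately
    # because both versions appear in the paper.
    group_a = [r for r in adults if r["group"] == "A"]
    results["Table 2, Model 2: slope"] = ols_slope(
        [r["age"] for r in group_a], [r["sbp"] for r in group_a])

    return results


if __name__ == "__main__":
    for label, value in run_all().items():
        print(f"{label}: {value:.2f}")
```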

BETTER

Include a codebook that briefly describes each variable whose meaning is not otherwise obvious, including units of measurement.
Use code in the script to import the data with the file path (e.g. C:/users/xxx/Projects/MyProject/) and file name (e.g. DataFile.xlsx or DataFile.csv) at the top of the script.
- This allows the reviewer to see the specific data file(s) and version being used.
- Ideally, put the file path in only one place at the top of the code, and use this 'variable' each time it's needed for importing data or exporting output.
- This also lets the reviewer change the file path in one place and quickly re-run everything on their own computer.
Remove unnecessary code and results for models/tables/figures not needed for this paper/report.
Use comments in the code (and output) to indicate which section/model/table in the paper is being run.
- E.g. add table numbers or labels that we can use to identify each piece of output in the paper/results and find it easily in the code.
- Put the analyses in the code in the same order as the results text/tables to facilitate review.
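
One way to keep the file path in a single place is sketched below in Python; the same pattern works with a path variable in R, a macro variable in SAS, or a global macro in Stata. The folder and file names reuse the examples given above, and the helper names read_data and write_table are hypothetical.

```python
import csv
from pathlib import Path

# ---- Paths: the ONLY lines a reviewer should need to edit ----------------
PROJECT_DIR = Path("C:/users/xxx/Projects/MyProject")  # example path from the text
DATA_FILE = PROJECT_DIR / "DataFile.csv"
OUTPUT_DIR = PROJECT_DIR / "output"


def read_data(path=DATA_FILE):
    """Import the analysis data from the path defined at the top."""
    with Path(path).open(newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def write_table(label, rows, out_dir=OUTPUT_DIR):
    """Export one table, named by its label in the paper (e.g. "table1")."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"{label}.csv"
    with out_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return out_path


# ---- Table 1: descriptive statistics --------------------------------------
# data = read_data()
# write_table("table1", ...)

# ---- Table 2: primary model -----------------------------------------------
# write_table("table2", ...)
```

The commented section headers follow the order of the tables in the paper, so a reviewer can jump from a table number to the code that produced it.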

BEST (or at least BCC preferences)

Use R Markdown or SAS ODS to automatically save the statistical output with headers, descriptions, and notes that collaborators and reviewers can easily see.
Use lots of comments in the code and/or output for any decisions that were made or methods in the code that may not be obvious, including paper references.
Ideally, put all the data prep at the top of the code (or in a separate piece of code/script) to create the analytic file, followed by the analytic code.
- We believe this makes the code more organized and easier to read.
Use code to extract the individual results from model output and put them into tables (in the markdown output and/or Excel), rather than entering tables manually.
Ideally, have BOTH the "raw" model output AND the clean tables in the output for the reviewer to see where table numbers originate.
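
A minimal sketch of building a clean table from raw model output by code is shown below, in Python for illustration only. The raw records, the column names, and the Wald-type 95% interval (estimate plus or minus 1.96 standard errors) are assumptions made for the example; in practice the raw values would come straight from the fitted model object in your software.

```python
def format_p(p):
    """Format a p-value to three decimals, displaying small values as <0.001."""
    return "<0.001" if p < 0.001 else f"{p:.3f}"


def build_table(raw_rows, z=1.96):
    """Turn raw coefficient records into display rows.

    Assumption: a Wald-type interval, estimate +/- z * standard error.
    """
    table = []
    for r in raw_rows:
        lo = r["estimate"] - z * r["std_error"]
        hi = r["estimate"] + z * r["std_error"]
        table.append({
            "Term": r["term"],
            "Estimate (95% CI)": f"{r['estimate']:.2f} ({lo:.2f}, {hi:.2f})",
            "p": format_p(r["p_value"]),
        })
    return table


# Hypothetical raw model output (made-up numbers, not real results).
raw = [
    {"term": "age", "estimate": 0.5, "std_error": 0.1, "p_value": 0.0004},
    {"term": "treatment", "estimate": -1.2, "std_error": 0.7, "p_value": 0.0871},
]

# Show BOTH the raw output and the clean table, so the reviewer can see
# where each number in the table originates.
print("Raw model output")
for r in raw:
    print(r)

print("Table 2 (clean)")
for row in build_table(raw):
    print(row)
```

Because the table is generated from the raw records, a change to the model updates the table automatically and avoids transcription errors.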

See our Correspondence in Nature (July 25, 2023): https://www.nature.com/articles/d41586-023-02250-z. The Reproducibility Checklist PDF is posted here: https://osf.io/t83w2.
