dataset-report generates a report for each dataset in an input directory. The report is a self-contained HTML file that has the same name as the dataset, but with the extension html.
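The naming rule above can be illustrated with a short Python sketch. It is only an illustration, not dataset-report's own code: the helper `report_path` is hypothetical, and it assumes that only the final extension of the dataset's file name is swapped for `.html`.

```python
from pathlib import Path


def report_path(dataset_path: str, output_dir: str) -> Path:
    """Return the report path for one dataset (hypothetical helper)."""
    # Keep the dataset's base name and swap its final extension for .html.
    report_name = Path(dataset_path).with_suffix(".html").name
    # The report is written to the output directory, not next to the dataset.
    return Path(output_dir) / report_name


if __name__ == "__main__":
    print(report_path("output/input.csv", "output").as_posix())
```

Under these assumptions, a dataset at `output/input.csv` and an output directory of `output` give a report at `output/input.html`.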
dataset-report's repository contains a study that extracts a (dummy) dataset and generates a report for it.
dataset-report applies disclosure controls to each report that it generates, meaning OpenSAFELY output checkers can be confident that a report is a safe output. Specifically, dataset-report:
- rounds counts to the nearest five; then
- redacts counts that are less than or equal to five.
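The two steps above can be sketched in Python. This is a minimal illustration, not dataset-report's actual implementation: the names `round_to_nearest` and `apply_disclosure_controls` are hypothetical, `None` stands in for a redacted value, and the sketch assumes that redaction is applied to the already rounded count.

```python
from typing import Optional


def round_to_nearest(count: int, base: int = 5) -> int:
    """Round a non-negative count to the nearest multiple of base."""
    # Integer arithmetic avoids floating-point surprises; with base=5
    # there are no ties, so 7 -> 5 and 8 -> 10.
    return base * ((count + base // 2) // base)


def apply_disclosure_controls(
    count: int, base: int = 5, threshold: int = 5
) -> Optional[int]:
    """Round first, then redact small counts (None means redacted)."""
    rounded = round_to_nearest(count, base)
    # Assumption: the threshold is compared against the rounded count.
    if rounded <= threshold:
        return None
    return rounded


if __name__ == "__main__":
    for raw in [3, 7, 8, 12, 13, 101]:
        print(raw, "->", apply_disclosure_controls(raw))
```

Under these assumptions, a raw count of 7 rounds to 5 and is then redacted, whereas a raw count of 8 rounds to 10 and is reported.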
The OpenSAFELY documentation has more information on disclosure controls and safe outputs.
In summary:
- Use cohort-extractor to extract one or more datasets.
- Use dataset-report to generate a report for each dataset.
Let's walk through an example project.yaml.
The following cohort-extractor action extracts a dataset:
    run: >
      cohortextractor:latest generate_cohort
        --study-definition study_definition
    outputs:
      highly_sensitive:
        cohort: output/input.csv
Finally, the following dataset-report reusable action generates a report for the dataset.
Remember to replace [version] with a dataset-report version:
    run: >
      dataset-report:[version]
        --input-files output/input.csv
        --output-dir output
    needs: [generate_cohort]
    outputs:
      moderately_sensitive:
        dataset_report: output/input.html
Please see
Because of a bug in the GitHub workflow that tagged versions, some versions of dataset-report are ordered incorrectly when sorted by the creation dates of their associated commits: a later version can precede an earlier one. For example, v0.0.19 (later) precedes v0.0.17 (earlier). The bug was fixed in #111.
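When choosing a version, one way to sidestep the date ordering is to compare the version numbers themselves. The Python sketch below is an illustration only: the helper `version_key` is hypothetical, and it assumes every tag has the form `vMAJOR.MINOR.PATCH` with purely numeric parts.

```python
from typing import Tuple


def version_key(tag: str) -> Tuple[int, ...]:
    """Turn a tag such as 'v0.0.19' into (0, 0, 19) for numeric comparison."""
    # Drop the leading "v", then compare each part as an integer so that
    # 19 sorts after 17 (and 10 after 9), regardless of commit dates.
    return tuple(int(part) for part in tag.lstrip("v").split("."))


if __name__ == "__main__":
    # The two tags from the example above, in their date-sorted order.
    tags_by_date = ["v0.0.19", "v0.0.17"]
    print(sorted(tags_by_date, key=version_key))
    print(max(tags_by_date, key=version_key))
```

Sorting with this key puts v0.0.17 before v0.0.19, whatever the dates of their commits.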