dataset-report generates a report for each dataset in an input directory. The report is a self-contained HTML file that has the same name as the dataset, but with the .html extension. For example, the report for input.csv is input.html.
dataset-report's repository contains a study that extracts a (dummy) dataset and generates a report for it.
dataset-report applies disclosure controls to each report that it generates, meaning OpenSAFELY output checkers can be confident that a report is a safe output. Specifically, dataset-report:
- rounds counts to the nearest five; then
- redacts counts that are less than or equal to five.
The OpenSAFELY documentation has more information on disclosure controls and safe outputs.
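The two steps above can be sketched in Python. This is a minimal illustration, not dataset-report's actual implementation: the function names `round_to_nearest` and `apply_disclosure_controls` are hypothetical, and the sketch assumes that redaction is applied to the already-rounded count and that a redacted count is represented as `None`.

```python
from typing import Optional


def round_to_nearest(count: int, base: int = 5) -> int:
    """Round a non-negative count to the nearest multiple of base."""
    # Adding half the base before floor division rounds to the nearest multiple.
    return base * ((count + base // 2) // base)


def apply_disclosure_controls(
    count: int, base: int = 5, threshold: int = 5
) -> Optional[int]:
    """Round first, then redact small counts (assumed order, per the list above)."""
    rounded = round_to_nearest(count, base)
    # Assumption: a redacted count is represented as None.
    if rounded <= threshold:
        return None
    return rounded


if __name__ == "__main__":
    for raw in (3, 7, 8, 12, 13):
        print(raw, apply_disclosure_controls(raw))
```

Under these assumptions, a count of 7 rounds to 5 and is then redacted, while a count of 8 rounds to 10 and is kept.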
In summary:
- Use cohort-extractor to extract one or more datasets.
- Use dataset-report to generate a report for each dataset.
Let's walk through an example project.yaml.
The following cohort-extractor action extracts a dataset:
generate_cohort:
  run: >
    cohortextractor:latest generate_cohort
      --study-definition study_definition
  outputs:
    highly_sensitive:
      cohort: output/input.csv
Finally, the following dataset-report reusable action generates a report for the dataset.
Remember to replace [version] with a dataset-report version:
generate_dataset_report:
  run: >
    dataset-report:[version]
      --input-files output/input.csv
      --output-dir output
  needs: [generate_cohort]
  outputs:
    moderately_sensitive:
      dataset_report: output/input.html
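The report path declared under outputs has to match the file that dataset-report writes. As a rough aid for working out that path, the naming rule (same name as the dataset, with the html extension) can be sketched in Python. The helper `report_path` is hypothetical and is not part of dataset-report; it assumes the dataset file has a single extension, such as .csv.

```python
from pathlib import Path


def report_path(dataset_path: str, output_dir: str) -> Path:
    """Return the expected report path for a dataset (illustrative only)."""
    # Swap the dataset's final extension for .html and keep only the file name.
    report_name = Path(dataset_path).with_suffix(".html").name
    # Assumption: the report is written directly into the output directory.
    return Path(output_dir) / report_name


if __name__ == "__main__":
    print(report_path("output/input.csv", "output"))
```

Note that `with_suffix` replaces only the final extension, so this sketch does not say how dataset-report names reports for datasets with compound extensions.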
For notes on developing dataset-report, please see DEVELOPERS.md.