Cohort Report outputs graphs of variables in a study input file.
Consider the following extract from a study's project.yaml:
actions:
  generate_study_population:
    run: cohortextractor:latest generate_cohort
    outputs:
      highly_sensitive:
        cohort: output/input.csv
  generate_report:
    run: cohort-report:v2.1.0 output/input.csv
    needs: [generate_study_population]
    config:
      variable_types:
        age: int
        sex: categorical
        ethnicity: categorical
        bmi: float
        diabetes: binary
        chronic_liver_disease: binary
        imd: categorical
        region: categorical
        stp: categorical
        rural_urban: categorical
        prior_covid_date: date
      output_path: output/cohort_reports_outputs
    outputs:
      moderately_sensitive:
        reports: output/cohort_reports_outputs/descriptives_input.html
The generate_report action generates a report that contains a table and a chart for each variable in the input file.
The table contains unsafe statistics and should be checked thoroughly before it is released.
The chart contains statistics that have been made safe.
Cells in the underlying frequency table are redacted if:
- They contain fewer than 10 units
- They contain more than 90% of the total number of units
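The two redaction rules above can be illustrated with a minimal Python sketch. It is not cohortreport's own implementation: the function name `redact_counts`, its parameters, and the use of `None` to mark a redacted cell are assumptions made for illustration.

```python
from typing import Dict, Optional


def redact_counts(
    counts: Dict[str, int],
    min_units: int = 10,
    max_share: float = 0.9,
) -> Dict[str, Optional[int]]:
    """Return a copy of a frequency table with unsafe cells set to None.

    Hypothetical helper: a cell is redacted if it holds fewer than
    `min_units` units or more than `max_share` of all units.
    """
    total = sum(counts.values())
    redacted: Dict[str, Optional[int]] = {}
    for category, n in counts.items():
        too_small = n < min_units
        too_dominant = total > 0 and n > max_share * total
        redacted[category] = None if (too_small or too_dominant) else n
    return redacted


print(redact_counts({"A": 40, "B": 55, "C": 5}))
# {'A': 40, 'B': 55, 'C': None}
```

In this example only category C is redacted, because it holds fewer than 10 units; no category exceeds 90% of the 100 units in total.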
Notice the run and config properties.
The run property passes a specific input file to a specific version of cohortreport.
In this case, the specific input file is output/input.csv and the specific version of cohortreport is v2.1.0.
The config property passes configuration to cohortreport; for more information, see Configuration.
Notice that the HTML document is called descriptives_[the name of the specific input file, without the extension].html.
It is saved to the output_path (see below).
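The naming rule can be sketched in Python as follows. The helper `report_path` is hypothetical and only mirrors the rule described here; it is not taken from cohortreport's source.

```python
from pathlib import Path


def report_path(input_file: str, output_path: str = "cohort_reports_outputs") -> Path:
    """Build the expected report location for a given input file.

    Hypothetical helper: the file name is "descriptives_" plus the input
    file's name without its extension, plus ".html".
    """
    stem = Path(input_file).stem  # "output/input.csv" -> "input"
    return Path(output_path) / f"descriptives_{stem}.html"


print(report_path("output/input.csv", "output/cohort_reports_outputs").as_posix())
# output/cohort_reports_outputs/descriptives_input.html
```

The printed path matches the reports entry in the project.yaml extract above.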
output_path - the path to which the outputs are saved; it defaults to cohort_reports_outputs.
If the given path does not exist, then it is created.
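The create-if-missing behaviour corresponds to a common Python idiom, sketched below. The helper name `ensure_output_path` is an assumption, and whether cohortreport uses this exact call is not confirmed here.

```python
import tempfile
from pathlib import Path


def ensure_output_path(output_path: str) -> Path:
    """Create the output directory (and any parents) if it is missing."""
    path = Path(output_path)
    # exist_ok=True makes the call a no-op when the directory already exists.
    path.mkdir(parents=True, exist_ok=True)
    return path


# Demonstrate inside a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    created = ensure_output_path(f"{tmp}/output/cohort_reports_outputs")
    print(created.is_dir())  # True
```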
variable_types - this is an optional argument that should be used if the input files contain data without a type, for example, a CSV.
cohortreport can take in other files such as '.feather' and '.dta' which contain the type of the data in each column.
In these cases, a variable_types config is not needed.
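To see why a CSV needs variable_types, note that every value read from a CSV arrives as a string. The sketch below applies a type mapping using only the Python standard library. It is an illustration under stated assumptions, not cohortreport's implementation: the `CONVERTERS` table, the `typed_rows` helper, the 0/1 encoding for binary, ISO 8601 dates, and treating empty strings as missing are all assumed.

```python
import csv
import io
from datetime import date

# Hypothetical mapping from variable_types names to Python converters.
CONVERTERS = {
    "int": int,
    "float": float,
    "categorical": str,
    "binary": lambda v: bool(int(v)),  # assumes 0/1 encoding
    "date": date.fromisoformat,        # assumes YYYY-MM-DD
}


def typed_rows(csv_text, variable_types):
    """Parse CSV text and convert each column using variable_types."""
    rows = []
    for raw in csv.DictReader(io.StringIO(csv_text)):
        row = {}
        for name, value in raw.items():
            # Columns without a declared type are kept as strings.
            convert = CONVERTERS[variable_types.get(name, "categorical")]
            # Assumption: an empty string means a missing value.
            row[name] = convert(value) if value != "" else None
        rows.append(row)
    return rows


csv_text = "age,sex,bmi,diabetes,prior_covid_date\n42,F,23.5,1,2020-03-01\n"
variable_types = {
    "age": "int",
    "sex": "categorical",
    "bmi": "float",
    "diabetes": "binary",
    "prior_covid_date": "date",
}
rows = typed_rows(csv_text, variable_types)
print(type(rows[0]["age"]).__name__, type(rows[0]["bmi"]).__name__)
# int float
```

Formats such as '.feather' and '.dta' store this type information alongside the data, which is why no mapping is required for them.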
For developer documentation, please see DEVELOPERS.md.