Safetab outputs two-way tables of descriptive statistics.
It applies conservative small-number suppression: if any cell in a table is redacted, then the whole table is also redacted. Safetab also creates a log file that records what has been redacted.
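
As a rough illustration of the suppression rule, here is a minimal Python sketch. It is not safetab's implementation: `redact_table` is a hypothetical helper, and it assumes that a table is a list of rows of counts and that a cell is redacted when its count is at or below the redaction limit (5 in this sketch).

```python
def redact_table(cells, redaction_limit=5):
    """Apply a conservative small-number rule to a table of counts.

    Hypothetical helper for illustration only; not part of safetab.
    Returns (cells, False) if no cell is at or below the limit,
    otherwise (None, True) to signal that the whole table is redacted.
    """
    # Flag every cell whose count is at or below the limit.
    flagged = [[count <= redaction_limit for count in row] for row in cells]
    # Conservative rule: a single flagged cell redacts the whole table.
    if any(any(row) for row in flagged):
        return None, True
    return cells, False


print(redact_table([[12, 30], [8, 41]]))  # no cell at or below 5
print(redact_table([[12, 5], [8, 41]]))   # one cell equals the limit
```

Under these assumptions, the first call keeps the table and the second redacts it, because one cell is equal to the limit.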
Consider the following extract from a study's project.yaml:

    actions:

      generate_study_population:
        run: cohortextractor:latest generate_cohort
        outputs:
          highly_sensitive:
            cohort: output/input.csv

      generate_safetabs:
        run: safetab:v3.0.2 output/input.csv
        needs: [generate_study_population]
        config:
          output_path: output
          redaction_limit: 5
          tables:
            two_way:
              tab_type: 2-way
              variables:
                - sex
                - age_band
        outputs:
          moderately_sensitive:
            safetab_log: output/input_tables/table_log.txt
            safetab_two_way: output/input_tables/two_way/*.md

The generate_safetabs action outputs one table: sex vs age_band.
Notice the run and config properties.
The run property passes a specific input table to a specific version of safetab.
In this case, the specific input table is output/input.csv and the specific version of safetab is v3.0.2.
The config property passes configuration to safetab; for more information, see Configuration.
- output_path, which defaults to safetab_tables.
  Save the outputs to the given path.
  If the given path does not exist, then it is created.
- redaction_limit, which defaults to 5.
  Cells (and tables) with this number of records or fewer are redacted.
- tables, where the tab_type of each table is one of:
  - 2-way: Outputs a two-way table for each pair (combination) of variables.
    For example, given the variables sex and age_band, it will output one table:
    - sex vs age_band

    Given the variables sex, age_band, and has_copd, it will output three tables:
    - sex vs age_band
    - sex vs has_copd
    - age_band vs has_copd
  - target-2-way: Outputs a two-way table of each variable against a target variable, which is given by the target property.
    For example, given the variables sex and age_band, and the target variable has_copd, it will output two tables:
    - sex vs has_copd
    - age_band vs has_copd
  - groupby-2-way: Outputs a two-way table for each pair (combination) of variables, for each unique value of a group-by variable, which is given by the groupby property.
    For example, given the variables sex and has_copd, and the group-by variable age_band, it will output one table for each unique value of age_band:
    - sex vs has_copd for 16-29
    - sex vs has_copd for 30-39
    - ...
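
To make the three table types concrete, the following Python sketch enumerates which tables each type would produce for a given set of variables. It only mirrors the descriptions above and is not safetab's own code: the function names `two_way`, `target_two_way`, and `groupby_two_way` are hypothetical, and the ordering of the tables is an assumption.

```python
from itertools import combinations


def two_way(variables):
    """One table per unordered pair of variables (2-way)."""
    return list(combinations(variables, 2))


def target_two_way(variables, target):
    """One table per variable, each against the target (target-2-way)."""
    return [(variable, target) for variable in variables]


def groupby_two_way(variables, groupby_values):
    """One table per pair of variables per group-by value (groupby-2-way)."""
    return [
        (left, right, value)
        for left, right in combinations(variables, 2)
        for value in groupby_values
    ]


print(two_way(["sex", "age_band", "has_copd"]))
# [('sex', 'age_band'), ('sex', 'has_copd'), ('age_band', 'has_copd')]

print(target_two_way(["sex", "age_band"], "has_copd"))
# [('sex', 'has_copd'), ('age_band', 'has_copd')]

# Only the first two age bands from the example above are used here.
print(groupby_two_way(["sex", "has_copd"], ["16-29", "30-39"]))
# [('sex', 'has_copd', '16-29'), ('sex', 'has_copd', '30-39')]
```

The number of 2-way tables grows quadratically with the number of variables (n variables give n(n-1)/2 tables), so a target-2-way table set is usually the smaller choice when only one outcome is of interest.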
Please see DEVELOPERS.md.