R/02-harmo_process_harmonization.R
as_dataschema_mlstr.Rd
Validates the input object as a DataSchema compliant with the formats used in the Maelstrom Research ecosystem, including Opal, and returns it with the appropriate harmonizR::class attribute. This function mainly helps validate input within other functions of the package, but it can also be used directly to check whether an object is valid for use in a function.
as_dataschema_mlstr(object)
A potential Maelstrom-formatted DataSchema to be coerced.
A list of tibble(s) identifying the DataSchema.
A DataSchema defines the harmonized variables to be generated and represents the metadata of an associated harmonized dossier. It must be a list of data frame-like objects with elements named 'Variables' (required) and 'Categories' (if any). The 'Variables' element must contain at least the 'name' column, and the 'Categories' element must contain at least the 'variable' and 'name' columns to be usable in any function. To be considered a minimum workable DataSchema, the 'name' column in 'Variables' must also have unique and non-null entries, and the combination of the 'variable' and 'name' columns in 'Categories' must also be unique.
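The structural rules above can be illustrated with a minimal sketch in base R. The helper `is_minimal_dataschema()` and the two-variable example object are hypothetical and are not part of the package. They only mirror the rules as stated in this section and do not reproduce the checks that as_dataschema_mlstr() actually performs.

```r
# Hypothetical minimal DataSchema: a list with 'Variables' and 'Categories'.
dataschema <- list(
  Variables = data.frame(
    name = c("sdc_age", "sdc_gender"),
    stringsAsFactors = FALSE
  ),
  Categories = data.frame(
    variable = c("sdc_gender", "sdc_gender"),
    name = c("1", "2"),
    stringsAsFactors = FALSE
  )
)

# Hypothetical helper (not a package function): TRUE when the object
# satisfies the minimum rules described in this section.
is_minimal_dataschema <- function(x) {
  if (!is.list(x)) return(FALSE)

  # 'Variables' is required and must have a unique, non-missing 'name' column.
  vars <- x[["Variables"]]
  if (!is.data.frame(vars) || !"name" %in% names(vars)) return(FALSE)
  if (anyNA(vars$name) || anyDuplicated(vars$name) > 0) return(FALSE)

  # 'Categories' is optional; when present, each ('variable', 'name')
  # pair must appear only once.
  cats <- x[["Categories"]]
  if (is.null(cats)) return(TRUE)
  if (!is.data.frame(cats)) return(FALSE)
  if (!all(c("variable", "name") %in% names(cats))) return(FALSE)
  anyDuplicated(cats[, c("variable", "name")]) == 0
}

is_minimal_dataschema(dataschema)
```

An object that passes this sketch has the shape the function expects, but only as_dataschema_mlstr() itself determines whether the package accepts it.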
{
# You can use our demonstration files to run examples
as_dataschema_mlstr(DEMO_files_harmo$`dataschema - final`)
}
#> $Variables
#> # A tibble: 13 × 8
#> name `label:en` valueType index unit `Mlstr_area::1` `Mlstr_area::1.term`
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 adm_un… Unique id… text 1 NA ADM Identifiers
#> 2 adm_st… Indicator… text 2 NA ADM Questionnaire_inter…
#> 3 adm_ye… Indicator… text 3 NA ADM Questionnaire_inter…
#> 4 sdc_age Participa… integer 4 years SDC Age
#> 5 sdc_ge… Gender of… integer 5 NA SDC Sex
#> 6 phy_he… participa… decimal 6 cm PME Anthropo_measures
#> 7 phy_we… participa… decimal 7 kg PME Anthropo_measures
#> 8 phy_bmi participa… decimal 8 kg/m PME Anthropo_measures
#> 9 rep_pr… whether t… integer 9 NA REP Pregnancy_delivery
#> 10 rep_pr… whether t… integer 10 NA REP Pregnancy_delivery
#> 11 lsb_sm… whether t… integer 11 NA LSB Tobacco
#> 12 lsb_sm… whether t… integer 12 NA LSB Tobacco
#> 13 lsb_sm… participa… integer 13 NA LSB Tobacco
#> # ℹ 1 more variable: `Mlstr_area::1.scale` <chr>
#>
#> $Categories
#> # A tibble: 13 × 4
#> variable name `label:en` missing
#> <chr> <chr> <chr> <lgl>
#> 1 sdc_gender 1 Male FALSE
#> 2 sdc_gender 2 Female FALSE
#> 3 rep_preg_ever 0 never pregnant FALSE
#> 4 rep_preg_ever 1 pregnant once or more FALSE
#> 5 rep_preg_curr 0 currently pregnant FALSE
#> 6 rep_preg_curr 1 not currently pregnant FALSE
#> 7 lsb_smo_ever 0 never smoked FALSE
#> 8 lsb_smo_ever 1 smoked one pack of cigarette or more FALSE
#> 9 lsb_smo_curr 0 currently smoker FALSE
#> 10 lsb_smo_curr 1 not currently smoker FALSE
#> 11 lsb_smo_status 0 never smoker FALSE
#> 12 lsb_smo_status 1 former smoker FALSE
#> 13 lsb_smo_status 2 current smoker FALSE
#>
#> attr(,"madshapR::class")
#> [1] "data_dict_mlstr"
#> attr(,"harmonizR::class")
#> [1] "Dataschema_mlstr"