Checks that the input object is a valid DataSchema and coerces it with the appropriate harmonizR::class attribute. This function is mainly used to validate input inside other functions of the package, but it can also be called directly to check whether an object is valid for use in a function.

as_dataschema(object, as_dataschema_mlstr = FALSE)

Arguments

object

A potential DataSchema (list of tibbles) to be coerced.

as_dataschema_mlstr

Whether the output DataSchema should have only a minimal DataSchema structure or also carry additional attributes that enable additional capabilities for Maelstrom and integrated workflows, such as Opal environments. FALSE by default.

Value

A list of tibble(s) named 'Variables' and 'Categories' (if any), which are the two elements of the DataSchema.

Details

A DataSchema defines the harmonized variables to be generated and represents the metadata of an associated harmonized dossier. It must be a list of data frame-like objects with elements named 'Variables' (required) and 'Categories' (if any). To be usable in any function, the 'Variables' element must contain at least the 'name' column, and the 'Categories' element must contain at least the 'variable' and 'name' columns. To be considered a minimum workable DataSchema, the 'name' column in 'Variables' must also have unique and non-null entries, and the combination of the 'variable' and 'name' columns in 'Categories' must also be unique.
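The structural rules above can be illustrated with a small base R sketch. It is not the package's implementation: the helper check_minimal_dataschema() and the toy dataschema object are hypothetical names introduced here, and the sketch treats "non-null" as "not NA", which is an assumption. It only mirrors the rules as stated in this section, whereas as_dataschema() may perform further checks.

```r
# Hypothetical toy DataSchema built from base R data frames (illustration only).
dataschema <- list(
  Variables = data.frame(
    name = c("sdc_age", "sdc_gender"),
    stringsAsFactors = FALSE
  ),
  Categories = data.frame(
    variable = c("sdc_gender", "sdc_gender"),
    name = c("1", "2"),
    stringsAsFactors = FALSE
  )
)

# Hypothetical helper, not part of the package: returns TRUE when the object
# satisfies the minimal rules described in the Details section.
check_minimal_dataschema <- function(x) {
  # 'Variables' is required and must be data frame-like.
  if (!is.list(x) || !is.data.frame(x[["Variables"]])) return(FALSE)
  vars <- x[["Variables"]]

  # 'Variables' needs a 'name' column with unique, non-missing entries
  # (assumption: "non-null" is read as "not NA").
  if (!"name" %in% names(vars)) return(FALSE)
  if (anyNA(vars$name) || anyDuplicated(vars$name) > 0) return(FALSE)

  # 'Categories' is optional; when present it needs 'variable' and 'name',
  # and each (variable, name) pair must appear only once.
  cats <- x[["Categories"]]
  if (!is.null(cats)) {
    if (!is.data.frame(cats)) return(FALSE)
    if (!all(c("variable", "name") %in% names(cats))) return(FALSE)
    if (anyDuplicated(cats[, c("variable", "name")]) > 0) return(FALSE)
  }
  TRUE
}

check_minimal_dataschema(dataschema)
```

Under these assumptions, duplicating a row in 'Categories' or repeating a value in the 'name' column of 'Variables' makes the helper return FALSE.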

Examples

{

# You can use our demonstration files to run examples

as_dataschema(DEMO_files_harmo$`dataschema - final`)

}
#> $Variables
#> # A tibble: 13 × 9
#>    name           typeof    index `label:en`     valueType unit  `Mlstr_area::1`
#>    <chr>          <chr>     <chr> <chr>          <chr>     <chr> <chr>          
#>  1 adm_unique_id  character 1     Unique identi… text      NA    ADM            
#>  2 adm_study      character 2     Indicator of … text      NA    ADM            
#>  3 adm_year_dce   character 3     Indicator of … text      NA    ADM            
#>  4 sdc_age        integer   4     Participant's… integer   years SDC            
#>  5 sdc_gender     integer   5     Gender of the… integer   NA    SDC            
#>  6 phy_height     double    6     participant's… decimal   cm    PME            
#>  7 phy_weight     double    7     participant's… decimal   kg    PME            
#>  8 phy_bmi        double    8     participant's… decimal   kg/m  PME            
#>  9 rep_preg_ever  integer   9     whether the p… integer   NA    REP            
#> 10 rep_preg_curr  integer   10    whether the p… integer   NA    REP            
#> 11 lsb_smo_ever   integer   11    whether the p… integer   NA    LSB            
#> 12 lsb_smo_curr   integer   12    whether the p… integer   NA    LSB            
#> 13 lsb_smo_status integer   13    participant s… integer   NA    LSB            
#> # ℹ 2 more variables: `Mlstr_area::1.term` <chr>, `Mlstr_area::1.scale` <chr>
#> 
#> $Categories
#> # A tibble: 13 × 5
#>    variable       name  labels `label:en`                           missing
#>    <chr>          <chr> <chr>  <chr>                                <chr>  
#>  1 sdc_gender     1     1      Male                                 0      
#>  2 sdc_gender     2     2      Female                               0      
#>  3 rep_preg_ever  0     0      never pregnant                       0      
#>  4 rep_preg_ever  1     1      pregnant once or more                0      
#>  5 rep_preg_curr  0     0      currently pregnant                   0      
#>  6 rep_preg_curr  1     1      not currently pregnant               0      
#>  7 lsb_smo_ever   0     0      never smoked                         0      
#>  8 lsb_smo_ever   1     1      smoked one pack of cigarette or more 0      
#>  9 lsb_smo_curr   0     0      currently smoker                     0      
#> 10 lsb_smo_curr   1     1      not currently smoker                 0      
#> 11 lsb_smo_status 0     0      never smoker                         0      
#> 12 lsb_smo_status 1     1      former smoker                        0      
#> 13 lsb_smo_status 2     2      current smoker                       0      
#> 
#> attr(,"madshapR::class")
#> [1] "data_dict"
#> attr(,"harmonizR::class")
#> [1] "Dataschema"