Extract and create the DataSchema from data processing elements

Creates the DataSchema in the Maelstrom Research format (with 'Variables' and 'Categories' in separate tibbles, each with standard columns) from any data processing elements.

dataschema_extract(data_proc_elem)

Arguments

data_proc_elem: A tibble containing the input data processing elements.

Value

A list of tibble(s) named 'Variables' and 'Categories' (if any), which together make up the DataSchema.

Details

A data processing element contains the rules and metadata used to harmonize input datasets in accordance with the DataSchema. It must be a data frame or data frame extension (e.g. a tibble) and must contain the columns that take part in the process, including dataschema_variable, ss_table, ss_variables, Mlstr_harmo::rule_category and Mlstr_harmo::algorithm. The first processing element is mandatory: it must have "id_creation" in Mlstr_harmo::rule_category, followed by the name of the column used as the identifier of each dataset, which initiates the harmonization process.
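As a rough illustration of the structure described above, the following base R sketch builds a small, hypothetical data processing elements table and checks it against the two stated requirements (the listed columns are present, and the first rule category is "id_creation"). All cell values, the object names `required_cols`, `missing_cols` and `first_rule_ok`, the spelling `ss_table`, and the choice to record the identifier column in `ss_variables` are assumptions made for this example. They are not taken from the package, and the checks are not the validation that `dataschema_extract()` itself performs.

```r
# Hypothetical data processing elements; every value below is illustrative.
# check.names = FALSE keeps the "::" in the column names intact.
data_proc_elem <- data.frame(
  dataschema_variable          = c("adm_unique_id", "sdc_age"),
  ss_table                     = c("study_1", "study_1"),      # assumed spelling
  ss_variables                 = c("part_id", "age_years"),    # assumed placement of the id column
  `Mlstr_harmo::rule_category` = c("id_creation", "direct_mapping"),
  `Mlstr_harmo::algorithm`     = c("id_creation", "direct_mapping"),
  check.names      = FALSE,
  stringsAsFactors = FALSE
)

# Columns named in the Details section.
required_cols <- c(
  "dataschema_variable", "ss_table", "ss_variables",
  "Mlstr_harmo::rule_category", "Mlstr_harmo::algorithm"
)

# Which required columns are absent (empty when the table is complete).
missing_cols <- setdiff(required_cols, names(data_proc_elem))

# The first processing element must be the identifier creation rule.
first_rule_ok <- identical(
  data_proc_elem[["Mlstr_harmo::rule_category"]][1],
  "id_creation"
)
```

A table passing both checks has the general shape this section describes; whether it is accepted by the package also depends on requirements not listed here.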

Examples

{

# You can use our demonstration files to run examples

dataschema_extract(
  data_proc_elem = DEMO_files_harmo$`data_processing_elements - final`)
}
#> $Variables
#> # A tibble: 13 × 3
#>    name           label          valueType
#>    <chr>          <chr>          <chr>    
#>  1 adm_unique_id  adm_unique_id  text     
#>  2 adm_study      adm_study      text     
#>  3 adm_year_dce   adm_year_dce   text     
#>  4 sdc_age        sdc_age        integer  
#>  5 sdc_gender     sdc_gender     integer  
#>  6 phy_height     phy_height     decimal  
#>  7 phy_weight     phy_weight     decimal  
#>  8 phy_bmi        phy_bmi        decimal  
#>  9 rep_preg_ever  rep_preg_ever  integer  
#> 10 rep_preg_curr  rep_preg_curr  integer  
#> 11 lsb_smo_ever   lsb_smo_ever   integer  
#> 12 lsb_smo_curr   lsb_smo_curr   integer  
#> 13 lsb_smo_status lsb_smo_status integer  
#> 
#> attr(,"madshapR::class")
#> [1] "data_dict_mlstr"
#> attr(,"harmonizR::class")
#> [1] "Dataschema_mlstr"