Expands data dictionary column(s) in an element (the parameter 'from') into another element (the parameter 'to'). If the element 'from' contains any columns whose names start with 'prefix' (xx, yy), these columns are added as 'xx' and 'yy' in the element identified by 'to'. That data frame is created if necessary, and columns are added from left to right (unique names are generated if necessary). Within a collapsed column, entries are separated using the following structure: 'name = xx1 ; name = xx2'. This function is mainly used to expand the column(s) 'Categories::xx' in the 'Variables' element into the 'Categories' element with column(s) xx. This function is the reverse operation of data_dict_collapse().
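The separator structure described above can be illustrated with a short base R sketch. It is not the package's implementation: split_collapsed() is a hypothetical helper, the column name 'label' is illustrative, and the sketch assumes that labels contain neither ' ; ' nor ' = ' (real labels sometimes do, so the package presumably handles those cases differently).

```r
# Hypothetical helper, not part of madshapR: split one collapsed cell of the
# form "name = label ; name = label" into one row per category.
# Assumption: labels contain neither " ; " nor " = ".
split_collapsed <- function(variable, cell) {
  # Separate the entries, then separate each entry into name and label.
  entries <- strsplit(cell, " ; ", fixed = TRUE)[[1]]
  parts <- strsplit(entries, " = ", fixed = TRUE)
  data.frame(
    variable = variable,
    name = vapply(parts, function(p) p[1], character(1)),
    label = vapply(parts, function(p) p[2], character(1)),
    stringsAsFactors = FALSE
  )
}

sex_categories <- split_collapsed("SEX", "0 = Homme ; 1 = Femme")
print(sex_categories)
```

Each entry of the collapsed cell becomes one row keyed by the variable it came from, which is the shape the 'Categories' element takes after expansion.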

data_dict_expand(
  data_dict,
  from = "Variables",
  name_prefix = "Categories::",
  to = "Categories"
)

Arguments

data_dict

A list of data frame(s) representing metadata to be transformed.

from

A symbol identifying the name of the element (data frame) to take column(s) from. Default is 'Variables'.

name_prefix

Character string of the prefix of columns of interest. This prefix will be used to select columns, and to rename them in the 'to' element. Default is 'Categories::'.

to

A symbol identifying the name of the element (data frame) in which to create column(s). Default is 'Categories'.
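As a rough illustration of how 'name_prefix' selects and renames columns, the base R sketch below picks the columns of a stand-in 'from' data frame whose names start with the prefix and strips the prefix to obtain the names used in 'to'. The data frame and its values are invented for the example, and the selection logic is an assumption about the behaviour described above, not the package's own code.

```r
# Hypothetical stand-in for the 'from' element; names and values are illustrative.
variables <- data.frame(
  name = c("SEX", "SMO"),
  `Categories::label:fr` = c("0 = Homme ; 1 = Femme",
                             "0 = Non-fumeur ; 1 = Fumeur"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)
name_prefix <- "Categories::"

# Select the columns whose names start with the prefix.
prefixed <- names(variables)[startsWith(names(variables), name_prefix)]

# Strip the prefix to get the column names used in the 'to' element.
target_names <- substring(prefixed, nchar(name_prefix) + 1)
print(target_names)
```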

Value

A list of data frame(s) identifying a data dictionary.

Details

A data dictionary contains the list of variables in a dataset and metadata about those variables, and can be associated with a dataset. A data dictionary object is a list of data frame(s) named 'Variables' (required) and 'Categories' (if any). To be usable in any function, the data frame 'Variables' must contain at least the 'name' column, with all entries unique and non-missing, and the data frame 'Categories' must contain at least the 'variable' and 'name' columns, with each combination of 'variable' and 'name' being unique.
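The structural requirements above can be checked by hand with base R. The sketch below builds a minimal data dictionary with invented values and tests the two uniqueness rules; it only illustrates the constraints and is not the validation the package itself performs.

```r
# Minimal hand-built data dictionary (illustrative values only).
data_dict <- list(
  Variables = data.frame(name = c("SEX", "SMO"), stringsAsFactors = FALSE),
  Categories = data.frame(
    variable = c("SEX", "SEX", "SMO", "SMO"),
    name = c("0", "1", "0", "1"),
    stringsAsFactors = FALSE
  )
)

# 'Variables': the name column must be non-missing and unique.
variables_ok <- !anyNA(data_dict$Variables$name) &&
  !anyDuplicated(data_dict$Variables$name)

# 'Categories': each (variable, name) combination must be unique.
categories_ok <- !anyDuplicated(data_dict$Categories[c("variable", "name")])

print(c(variables_ok = variables_ok, categories_ok = categories_ok))
```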

Examples

# use madshapR_DEMO provided by the package
library(madshapR)

# expand the 'Categories::' column(s) of "Variables" into "Categories"
data_dict <- madshapR_DEMO$`data_dict_PARIS - collapsed`
data_dict_expand(data_dict)
#> $Variables
#> # A tibble: 7 × 8
#>   index name     `label:fr` `description:fr` valueType unit  `description::type`
#>   <dbl> <chr>    <chr>      <chr>            <chr>     <chr> <chr>              
#> 1     1 ID       id         id du participa… text      NA    other              
#> 2     2 SEX      Sexe       Sexe du partici… integer   NA    natural            
#> 3     3 BMI      IMC        Indice de Masse… decimal   kg/m… real               
#> 4     4 AGE      Age        Age du Particip… integer   years natural            
#> 5     5 SMO      Fumeur     Fumeur regulier  integer   NA    natural            
#> 6     6 SMO_QTY  fum_qte_s… nombre de cigar… integer   ciga… natural            
#> 7     7 PRG_EVER enceinte_… Si la personne … integer   NA    natural            
#> # ℹ 1 more variable: `description::level` <chr>
#> 
#> $Categories
#> # A tibble: 9 × 3
#>   name  variable `label:fr`                                    
#>   <chr> <chr>    <chr>                                         
#> 1 0     SEX      Homme                                         
#> 2 1     SEX      Femme                                         
#> 3 0     SMO      Non-fumeur                                    
#> 4 1     SMO      Fumeur (cigarette ; cigare)                   
#> 5 -8    SMO_QTY  SKIP PATTERN                                  
#> 6 0     PRG_EVER Jamais enceinte (nb grossesse = 0)            
#> 7 1     PRG_EVER Enceinte au moins une fois (nb grossesse => 1)
#> 8 9     PRG_EVER NSP                                           
#> 9 -8    PRG_EVER SKIP PATTERN                                  
#> 
#> attr(,"madshapR::class")
#> [1] "data_dict_structure"