Collapses a data dictionary element (the parameter 'from') into column(s) in another element (the parameter 'to'). If the element 'from' exists and contains any column 'xx' or 'yy', these columns will be added to the element 'to' under names built from 'name_prefix', by default 'Categories::xx' and 'Categories::yy' (unique names will be generated if necessary). Each entry of these columns gathers all the information needed to perform the reverse operation. Entries are stored using the structure 'name = xx1 ; name = xx2'. This function is mainly used to collapse the 'Categories' element into columns in 'Variables'. It is the reverse operation of data_dict_expand().

data_dict_collapse(
  data_dict,
  from = "Categories",
  to = "Variables",
  name_prefix = "Categories::"
)

Arguments

data_dict

A list of data frame(s) representing metadata to be transformed.

from

A symbol identifying the name of the element (data frame) to take column(s) from. Default is 'Categories'.

to

A symbol identifying the name of the element (data frame) in which to create the column(s). Default is 'Variables'.

name_prefix

A character string giving the prefix of the columns of interest. This prefix is used to select columns and to rename them in the 'to' element. Default is 'Categories::'.
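
To make the naming and the 'name = xx1 ; name = xx2' structure more concrete, here is a minimal base-R sketch of the idea. It is not the implementation used by data_dict_collapse(): the toy data, the helper collapse_column() and the plain ' ; ' separator are assumptions made for illustration only, and the package's actual output may differ in detail (for example in how line breaks are handled).

```r
# Hypothetical toy data; not taken from madshapR_DEMO.
variables <- data.frame(
  name = c("id", "gender"),
  stringsAsFactors = FALSE
)
categories <- data.frame(
  variable = c("gender", "gender"),
  name     = c("1", "2"),
  label    = c("Male", "Female"),
  stringsAsFactors = FALSE
)

name_prefix <- "Categories::"

# Hypothetical helper: collapse one Categories column into a single
# 'name = value ; name = value' string per variable.
collapse_column <- function(categories, column) {
  pairs <- paste(categories$name, "=", categories[[column]])
  vapply(
    split(pairs, categories$variable),
    function(x) paste(x, collapse = " ; "),
    character(1)
  )
}

collapsed <- collapse_column(categories, "label")

# Attach the result to Variables under the prefixed column name;
# variables without categories get NA.
new_col <- paste0(name_prefix, "label")
variables[[new_col]] <- unname(collapsed[variables$name])

# Reverse direction: split a collapsed cell back into name/value pairs.
cell  <- variables[[new_col]][2]
parts <- strsplit(cell, " ; ", fixed = TRUE)[[1]]
recovered <- data.frame(
  name  = sub(" = .*$", "", parts),
  label = sub("^.*? = ", "", parts, perl = TRUE),
  stringsAsFactors = FALSE
)

print(variables)
print(recovered)
```

Because each collapsed cell keeps both the category name and its value, the original rows can be rebuilt from the string alone, which is what allows the reverse operation.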

Value

A list of data frame(s) identifying a data dictionary.

Details

A data dictionary contains the list of variables in a dataset and metadata about the variables, and can be associated with a dataset. A data dictionary object is a list of data frame(s) named 'Variables' (required) and 'Categories' (if any). To be usable in any function, the data frame 'Variables' must contain at least the 'name' column, with all unique and non-missing entries, and the data frame 'Categories' must contain at least the 'variable' and 'name' columns, with unique combinations of 'variable' and 'name'.
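
The structural requirements above can be sketched in base R as follows. The toy data dictionary and the helper check_minimal_data_dict() are hypothetical and are not part of madshapR; the package's own validation may be stricter or differ in detail.

```r
# Hypothetical minimal data dictionary (toy values, for illustration only).
data_dict <- list(
  Variables = data.frame(
    name = c("id", "gender"),
    stringsAsFactors = FALSE
  ),
  Categories = data.frame(
    variable = c("gender", "gender"),
    name     = c("1", "2"),
    stringsAsFactors = FALSE
  )
)

# Hypothetical helper, not part of madshapR: checks the minimal
# constraints listed in the paragraph above and errors if one fails.
check_minimal_data_dict <- function(dd) {
  vars <- dd[["Variables"]]
  stopifnot(
    is.data.frame(vars),
    "name" %in% names(vars),
    !anyNA(vars$name),          # no missing variable names
    !anyDuplicated(vars$name)   # variable names are unique
  )
  cats <- dd[["Categories"]]
  if (!is.null(cats)) {         # 'Categories' is optional
    stopifnot(
      is.data.frame(cats),
      all(c("variable", "name") %in% names(cats)),
      !anyDuplicated(cats[c("variable", "name")])  # unique combinations
    )
  }
  invisible(TRUE)
}

check_minimal_data_dict(data_dict)
```

A dictionary with a duplicated or missing variable name, or with a repeated 'variable'/'name' pair in 'Categories', would make this helper stop with an error.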

Examples

{

# use madshapR_DEMO provided by the package

data_dict <- madshapR_DEMO$data_dict_MELBOURNE
data_dict_collapse(data_dict)

}
#> $Variables
#> # A tibble: 6 × 8
#>   index name  `label:en` `description:en` valueType unit  `Categories::label:en`
#>   <dbl> <chr> <chr>      <chr>            <chr>     <chr> <chr>                 
#> 1     1 id    id         id of the parti… text      NA     NA                   
#> 2     2 Gend… Gender     Gender           integer   NA    "1 = Male ; \n2 = Fem…
#> 3     3 BMI   BMI        Body Mass Index  decimal   kg/m…  NA                   
#> 4     4 age   age        Age of Particip… integer   years "-888 = don't want to…
#> 5     5 smo_… smo_status Whether the par… integer   NA    "1 = never smoked ; \…
#> 6     6 prg_… prg_curr   Are you current… integer   NA    "0 = not currently pr…
#> # ℹ 1 more variable: `Categories::missing` <chr>
#> 
#> attr(,"madshapR::class")
#> [1] "data_dict_structure"