Collapses a data dictionary element (the parameter 'from') into column(s) in another element (the parameter 'to'). If the element 'from' exists and contains any column 'xx' or 'yy', these columns will be added to the element 'to' under prefixed names (with the defaults, 'Categories::xx' and 'Categories::yy'). Unique names will be generated if necessary. Each entry of these columns gathers all the information needed to perform the reverse operation. Entries are separated using the following structure: 'name = xx1 ; name = xx2'. This function is mainly used to collapse the 'Categories' element into columns in 'Variables'. It is the reverse operation of data_dict_expand().
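
The collapsed format can be illustrated with a minimal base R sketch. It is not the madshapR implementation: the tables `variables` and `categories` are made-up examples, and the ' ; ' separator is taken from the description above, so the exact string produced by the package may differ.

```r
# Illustrative sketch only: base R, not the madshapR implementation.
# 'variables' and 'categories' are made-up example tables.
variables <- data.frame(
  name = c("gndr", "height"),
  stringsAsFactors = FALSE
)
categories <- data.frame(
  variable = c("gndr", "gndr"),
  name     = c("1", "2"),
  label    = c("Male", "Female"),
  stringsAsFactors = FALSE
)

# Build one "name = value" entry per category row.
entries <- paste(categories$name, categories$label, sep = " = ")

# Join the entries of each variable with " ; " (assumed separator).
collapsed <- vapply(
  split(entries, categories$variable),
  paste, character(1), collapse = " ; "
)

# Attach the result to 'variables' under a prefixed column name.
# Variables without categories (here 'height') get NA.
variables[["Categories::label"]] <- unname(collapsed[variables$name])

print(variables)
```

Each variable keeps a single row, and the category information survives as one string that can later be split back into rows.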

data_dict_collapse(
  data_dict,
  from = "Categories",
  to = "Variables",
  name_prefix = "Categories::"
)

Arguments

data_dict

A list of data frame(s) representing metadata to be transformed.

from

A symbol identifying the name of the element (data frame) to take column(s) from. Default is 'Categories'.

to

A symbol identifying the name of the element (data frame) in which to create column(s). Default is 'Variables'.

name_prefix

A character string giving the prefix of the columns of interest. This prefix is used to select columns and to rename them in the 'to' element. Default is 'Categories::'.
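
As a rough illustration of how a prefix can be used to rename and select columns, the following base R sketch works on a plain vector of column names. The column names are hypothetical, and the package may handle prefixes differently internally.

```r
# Hypothetical column names taken from a 'Categories' table.
from_cols   <- c("label:en", "missing")
name_prefix <- "Categories::"

# Renaming: add the prefix before the columns are written to 'to'.
prefixed <- paste0(name_prefix, from_cols)

# Selecting: find the prefixed columns among all columns of 'to'.
to_cols  <- c("name", "label:en", prefixed)
selected <- to_cols[startsWith(to_cols, name_prefix)]

# Dropping the prefix recovers the original column names.
restored <- substring(selected, nchar(name_prefix) + 1)
```

Because 'label:en' exists both with and without the prefix, the prefix is what keeps the variable-level and category-level labels apart.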

Value

A list of data frame(s) identifying a data dictionary.

Details

A data dictionary contains the list of variables in a dataset and metadata about the variables, and can be associated with a dataset. A data dictionary object is a list of data frame(s) named 'Variables' (required) and 'Categories' (if any). To be usable in any function, the data frame 'Variables' must contain at least the name column, with all unique and non-missing entries, and the data frame 'Categories' must contain at least the variable and name columns, with unique combinations of variable and name.
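
The structural requirements above can be sketched as a small check in base R. The helper `check_data_dict_structure()` is hypothetical and not part of madshapR. It only mirrors the rules stated in this section.

```r
# Hypothetical helper, not part of madshapR: checks the minimal structure
# described above and returns TRUE, or stops with an error message.
check_data_dict_structure <- function(data_dict) {
  variables <- data_dict[["Variables"]]
  if (!is.data.frame(variables) || !"name" %in% names(variables)) {
    stop("'Variables' must be a data frame with a 'name' column.")
  }
  if (anyNA(variables$name) || anyDuplicated(variables$name) > 0) {
    stop("'name' in 'Variables' must be unique and non-missing.")
  }

  # 'Categories' is optional; check it only when present.
  categories <- data_dict[["Categories"]]
  if (!is.null(categories)) {
    required <- c("variable", "name")
    if (!is.data.frame(categories) || !all(required %in% names(categories))) {
      stop("'Categories' must be a data frame with 'variable' and 'name'.")
    }
    if (anyDuplicated(categories[required]) > 0) {
      stop("Each (variable, name) pair in 'Categories' must be unique.")
    }
  }
  TRUE
}

# Made-up dictionary that satisfies the rules.
dd <- list(
  Variables  = data.frame(name = c("gndr", "height")),
  Categories = data.frame(variable = c("gndr", "gndr"), name = c("1", "2"))
)
result <- check_data_dict_structure(dd)
```

A dictionary with a duplicated variable name, or with the same (variable, name) pair listed twice in 'Categories', would stop with an error instead of returning TRUE.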

Examples

{

# use madshapR_examples provided by the package
data_dict <- madshapR_examples$`data_dictionary_example`
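# collapse the 'Categories' element into prefixed columns of 'Variables'
data_dict_collapsed <- data_dict_collapse(data_dict)
# the dictionary is collapsed a second time here; it no longer has a
# 'Categories' element, hence the warning in the output
head(data_dict_collapse(data_dict_collapsed))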

}
#> Warning: Your data dictionary contains no 'Categories' element.
#> $Variables
#> # A tibble: 9 × 10
#>   index name  `label:en` `description:en` valueType unit  `datacollection::type`
#>   <dbl> <chr> <chr>      <chr>            <chr>     <chr> <chr>                 
#> 1     1 part… id of the… id of the parti… text      NA    automatic             
#> 2     2 gndr  gndr       gender of the p… integer   NA    declared              
#> 3     3 heig… height     height of the p… integer   cm    declared              
#> 4     4 weig… weight_ms  weight of the p… integer   kg    measured              
#> 5     5 weig… weight_dc  weight of the p… decimal   kg    measured              
#> 6     6 dob   dob        date of birth o… date      years declared              
#> 7     7 prg_… prg_ever   whether the par… integer   NA    declared              
#> 8     8 empty empty      empty column     integer   NA    automatic             
#> 9     9 open… opentext   open text        text      NA    automatic             
#> # ℹ 3 more variables: `datacollection::level` <chr>,
#> #   `Categories::label:en` <chr>, `Categories::missing` <chr>
#>