R/02-dictionaries_functions.R
data_dict_collapse.Rd
Collapses a data dictionary element (the parameter 'from')
into column(s) in another element (the parameter 'to').
If the element 'from' exists and contains any column 'xx' or 'yy', these
columns will be added to the element 'to' under the names 'from::xx'
and 'from::yy' (unique names will be generated if necessary). Each element
of these columns gathers all the information needed to perform the reverse operation.
Within each element, entries are separated using the following structure:
'name = xx1 ; name = xx2'.
This function is mainly used to collapse the 'Categories' element into
columns in 'Variables'.
This function is the reverse operation of data_dict_expand().
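The collapsing described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the toy dictionary, the column name 'label', and the ' ; \n' separator (read off the printed example output) are all assumptions.

```r
# Toy data dictionary (hypothetical, for illustration only).
data_dict <- list(
  Variables = data.frame(name = c("id", "gender"), stringsAsFactors = FALSE),
  Categories = data.frame(
    variable = c("gender", "gender"),
    name     = c("1", "2"),
    label    = c("Male", "Female"),
    stringsAsFactors = FALSE
  )
)

cats <- data_dict$Categories

# Build one "name = label" entry per category, then join the entries per variable.
# The " ; \n" separator is an assumption based on the printed example output.
entries   <- paste(cats$name, "=", cats$label)
collapsed <- vapply(
  split(entries, cats$variable),
  paste, character(1), collapse = " ; \n"
)

# Attach the collapsed strings to 'Variables'; variables without categories get NA.
vars <- data_dict$Variables
vars[["Categories::label"]] <- unname(collapsed[match(vars$name, names(collapsed))])

# Reverse step: split a collapsed string back into category names and labels.
parts      <- strsplit(vars[["Categories::label"]][2], " ; \n", fixed = TRUE)[[1]]
cat_names  <- sub(" = .*$", "", parts)
cat_labels <- sub("^[^=]* = ", "", parts)
```

Because each collapsed string keeps both the category name and its value, the original rows can be recovered, which is what makes the reverse operation possible.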
data_dict_collapse(
data_dict,
from = "Categories",
to = "Variables",
name_prefix = "Categories::"
)
data_dict: A list of data frame(s) representing metadata to be transformed.
from: A symbol identifying the name of the element (data frame) to take column(s) from. Default is 'Categories'.
to: A symbol identifying the name of the element (data frame) to create column(s) in. Default is 'Variables'.
name_prefix: A character string giving the prefix of the columns of interest. This prefix is used to select columns and to rename them in the 'to' element. Default is 'Categories::'.
A list of data frame(s) identifying a data dictionary.
A data dictionary contains the list of variables in a dataset and metadata
about the variables and can be associated with a dataset. A data dictionary
object is a list of data frame(s) named 'Variables' (required) and
'Categories' (if any). To be usable in any function, the data frame
'Variables' must contain at least the 'name' column, with all unique and
non-missing entries, and the data frame 'Categories' must contain at least
the 'variable' and 'name' columns, with unique combinations of 'variable'
and 'name'.
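The minimal requirements above can be checked with a few lines of base R. The helper name check_data_dict_minimal() is hypothetical and is not part of the package; it is a sketch of the stated rules only, and the package may apply further checks.

```r
# Hypothetical helper (not from the package): returns TRUE when a list meets
# the minimal data dictionary requirements described in the text.
check_data_dict_minimal <- function(data_dict) {
  vars <- data_dict[["Variables"]]

  # 'Variables' is required and must have a 'name' column.
  if (!is.data.frame(vars) || !"name" %in% names(vars)) return(FALSE)

  # 'name' entries must be non-missing and unique.
  if (anyNA(vars$name) || anyDuplicated(vars$name) > 0) return(FALSE)

  # 'Categories' is optional; when present it needs 'variable' and 'name',
  # with unique combinations of the two.
  cats <- data_dict[["Categories"]]
  if (!is.null(cats)) {
    if (!is.data.frame(cats)) return(FALSE)
    if (!all(c("variable", "name") %in% names(cats))) return(FALSE)
    if (anyDuplicated(cats[c("variable", "name")]) > 0) return(FALSE)
  }

  TRUE
}

# Toy inputs (hypothetical): one valid dictionary, one with a duplicated name.
good <- list(
  Variables  = data.frame(name = c("id", "gender")),
  Categories = data.frame(variable = c("gender", "gender"), name = c("1", "2"))
)
bad <- list(Variables = data.frame(name = c("id", "id")))

ok_good <- check_data_dict_minimal(good)
ok_bad  <- check_data_dict_minimal(bad)
```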
{
# use madshapR_DEMO provided by the package
data_dict <- madshapR_DEMO$data_dict_MELBOURNE
data_dict_collapse(data_dict)
}
#> $Variables
#> # A tibble: 6 × 8
#> index name `label:en` `description:en` valueType unit `Categories::label:en`
#> <dbl> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 1 id id id of the parti… text NA NA
#> 2 2 Gend… Gender Gender integer NA "1 = Male ; \n2 = Fem…
#> 3 3 BMI BMI Body Mass Index decimal kg/m… NA
#> 4 4 age age Age of Particip… integer years "-888 = don't want to…
#> 5 5 smo_… smo_status Whether the par… integer NA "1 = never smoked ; \…
#> 6 6 prg_… prg_curr Are you current… integer NA "0 = not currently pr…
#> # ℹ 1 more variable: `Categories::missing` <chr>
#>
#> attr(,"madshapR::class")
#> [1] "data_dict_structure"