R/02-dictionaries_functions.R
data_dict_expand.Rd
Expands data dictionary column(s) in an element (the parameter 'from')
into another element (the parameter 'to').
If the element 'from' contains columns whose names start with the prefix
(for example 'Categories::xx' and 'Categories::yy'), these columns are added
as 'xx' and 'yy' to the element identified by 'to'. This data frame is
created if necessary, and columns are added from left to right (unique names
are generated if necessary).
Each value to expand is expected to follow the structure
'name = xx1 ; name = xx2', with ';' separating the entries.
This function is mainly used to expand the column(s) 'Categories::xx' in
"Variables" to "Categories" element with column(s) xx.
This function is the reverse operation of data_dict_collapse().
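The expansion described above can be illustrated with a minimal base R sketch. It is not the madshapR implementation: the helper `expand_cell_sketch()` and the toy `variables` data frame are hypothetical, and the sketch only assumes the 'name = xx1 ; name = xx2' structure stated above.

```r
# Hypothetical helper (not part of madshapR): turn one collapsed cell such as
# "1 = Male ; 2 = Female" into one row per 'name = value' entry.
expand_cell_sketch <- function(variable, cell, column) {
  if (is.na(cell)) return(NULL)               # nothing to expand
  pairs  <- trimws(strsplit(cell, ";", fixed = TRUE)[[1]])
  # split on the first '=' only, so a value may itself contain '='
  eq_pos <- as.integer(regexpr("=", pairs, fixed = TRUE))
  out <- data.frame(
    variable = variable,
    name     = trimws(substr(pairs, 1, eq_pos - 1)),
    value    = trimws(substr(pairs, eq_pos + 1, nchar(pairs))),
    stringsAsFactors = FALSE
  )
  names(out)[3] <- column                     # e.g. "label:en"
  out
}

# Toy 'Variables' element with one collapsed column (made-up data)
variables <- data.frame(
  name = c("part_id", "gndr"),
  `Categories::label:en` = c(NA, "1 = Male ; 2 = Female"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

prefix <- "Categories::"
col    <- "Categories::label:en"

rows <- Map(
  expand_cell_sketch,
  variables$name, variables[[col]],
  MoreArgs = list(column = sub(prefix, "", col, fixed = TRUE))
)
# Drop variables without categories and stack the remaining rows
categories <- do.call(rbind, unname(Filter(Negate(is.null), rows)))
rownames(categories) <- NULL
print(categories)
```

The prefix is stripped from the column name, so 'Categories::label:en' in the source element becomes 'label:en' in the target element, next to the 'variable' and 'name' columns.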
data_dict_expand(
data_dict,
from = "Variables",
name_prefix = "Categories::",
to = "Categories"
)
data_dict: A list of data frame(s) representing metadata to be transformed.
from: A symbol identifying the name of the element (data frame) to take column(s) from. Default is 'Variables'.
name_prefix: Character string of the prefix of columns of interest. This prefix will be used to select columns, and to rename them in the 'to' element. Default is 'Categories::'.
to: A symbol identifying the name of the element (data frame) to create column(s) to. Default is 'Categories'.
A list of data frame(s) identifying a data dictionary.
A data dictionary contains the list of variables in a dataset and metadata
about the variables and can be associated with a dataset. A data dictionary
object is a list of data frame(s) named 'Variables' (required) and
'Categories' (if any). To be usable in any function, the data frame
'Variables' must contain at least the 'name' column, with all unique and
non-missing entries, and the data frame 'Categories' must contain at least
the 'variable' and 'name' columns, with unique combination of 'variable'
and 'name'.
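The structural requirements above can be expressed as a small check. This is only a sketch in base R: `check_data_dict_sketch()` is a hypothetical helper, not a madshapR function, and it covers only the rules stated in this section.

```r
# Hypothetical checker (not part of madshapR): returns a character vector of
# problems, empty when the minimal structure described above is satisfied.
check_data_dict_sketch <- function(data_dict) {
  vars <- data_dict[["Variables"]]
  if (!is.data.frame(vars) || !"name" %in% names(vars)) {
    return("'Variables' must be a data frame with a 'name' column")
  }
  problems <- character(0)
  if (anyNA(vars$name) || anyDuplicated(vars$name) > 0) {
    problems <- c(problems, "'Variables$name' must be unique and non-missing")
  }
  cats <- data_dict[["Categories"]]           # optional element
  if (!is.null(cats)) {
    if (!all(c("variable", "name") %in% names(cats))) {
      problems <- c(problems,
                    "'Categories' needs 'variable' and 'name' columns")
    } else if (anyDuplicated(cats[c("variable", "name")]) > 0) {
      problems <- c(problems,
                    "'Categories' has duplicated (variable, name) pairs")
    }
  }
  problems
}

# Made-up data dictionaries for illustration
good <- list(
  Variables  = data.frame(name = c("part_id", "gndr")),
  Categories = data.frame(variable = c("gndr", "gndr"), name = c("1", "2"))
)
bad <- good
bad$Categories$name <- c("1", "1")            # same (variable, name) twice

check_data_dict_sketch(good)
check_data_dict_sketch(bad)
```

Returning messages instead of stopping at the first failure makes it possible to report every violated rule in one pass.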
{
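library(dplyr)
# use madshapR_examples provided by the package
data_dict_collapsed <- madshapR_examples$`data_dictionary_example - collapsed`
# expand the 'Categories::' column(s) of Variables into a Categories element
data_dict_expanded <- data_dict_expand(data_dict_collapsed)
# data_dict_expanded no longer has 'Categories::' columns, so expanding it
# again only emits the warning shown in the output
glimpse(data_dict_expand(data_dict_expanded))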
}
#> Warning: Your data dictionary contains no column starting with 'Categories::' in Variables
#> List of 2
#> $ Variables : tibble [9 × 8] (S3: tbl_df/tbl/data.frame)
#> ..$ index : num [1:9] 1 2 3 4 5 6 7 8 9
#> ..$ name : chr [1:9] "part_id" "gndr" "height" "weight_ms" ...
#> ..$ label:en : chr [1:9] "id of the participant" "gndr" "height" "weight_ms" ...
#> ..$ description:en : chr [1:9] "id of the participant" "gender of the participant" "height of the participant" "weight of the participant - measured" ...
#> ..$ valueType : chr [1:9] "text" "integer" "integer" "integer" ...
#> ..$ unit : chr [1:9] NA NA "cm" "kg" ...
#> ..$ datacollection::type : chr [1:9] "automatic" "declared" "declared" "measured" ...
#> ..$ datacollection::level: chr [1:9] "high" "high" "moderate" "moderate" ...
#> $ Categories: tibble [11 × 4] (S3: tbl_df/tbl/data.frame)
#> ..$ variable: chr [1:11] "gndr" "gndr" "gndr" "weight_ms" ...
#> ..$ name : chr [1:11] "1" "2" "-77" "-88" ...
#> ..$ label:en: chr [1:11] "Male" "Female" "Don't want to answer" "Don't want to answer" ...
#> ..$ missing : chr [1:11] "FALSE" "FALSE" "TRUE" "TRUE" ...
#> - attr(*, "madshapR::class")= chr "data_dict_structure"