R/02-dictionaries_functions.R
data_dict_group_split.Rd
Divides data dictionary element(s) into the groups defined by the query.
This function divides both the 'Variables' and 'Categories' elements (if
the group exists under the same definition in both) into a list of
data dictionaries, each with the rows of the associated group and all the
original columns, including grouping variables. This function is analogous
to running dplyr::group_by() and dplyr::group_split(). Each element is
named using the group values. data_dict_list_nest() reverses the effect.
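The dplyr analogy can be illustrated on an ordinary tibble. This is a minimal sketch, not the package's implementation: the `variables` table and its values are hypothetical, and only the documented dplyr functions group_by(), group_split() and group_keys() are used.

```r
library(dplyr)

# Hypothetical 'Variables'-like table; 'table' plays the role of the grouping column.
variables <- tibble(
  table = c("dataset_TOKYO", "dataset_TOKYO", "dataset_MELBOURNE"),
  name  = c("id", "age", "id")
)

grouped <- group_by(variables, table)

# group_split() returns an unnamed list with one tibble per group,
# in the same order as the rows of group_keys().
pieces <- as.list(group_split(grouped))

# Name each piece after its group value, mirroring the naming described above.
names(pieces) <- group_keys(grouped)$table

names(pieces)
```

Each piece keeps the grouping column, which matches the statement above that all original columns, including grouping variables, are retained.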
data_dict_group_split(data_dict, ...)
data_dict: A list of data frame(s) representing metadata to be transformed.
...: Column in the data dictionary to split it by. If not provided, the splitting will be done on the grouping element of a grouped data dictionary.
A list of data frame(s) identifying a list of workable data dictionary structures.
A data dictionary contains the list of variables in a dataset and metadata
about the variables and can be associated with a dataset. A data dictionary
object is a list of data frame(s) named 'Variables' (required) and
'Categories' (if any). To be usable in any function, the data frame
'Variables' must contain at least the 'name' column, with all unique and
non-missing entries, and the data frame 'Categories' must contain at least
the 'variable' and 'name' columns, with a unique combination of
'variable' and 'name' in each row.
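The structural requirements above can be sketched in base R. The object below is a hypothetical minimal data dictionary (the `label` columns and all values are illustrative assumptions), and the two checks are a hand-written approximation of the stated rules, not the package's own validation.

```r
# Hypothetical minimal data dictionary: a named list of data frames.
data_dict <- list(
  Variables = data.frame(
    name  = c("sex", "age"),
    label = c("Sex of participant", "Age in years")
  ),
  Categories = data.frame(
    variable = c("sex", "sex"),
    name     = c("1", "2"),
    label    = c("Female", "Male")
  )
)

# 'Variables': the 'name' column must be non-missing and unique.
variables_ok <- !anyNA(data_dict$Variables$name) &&
  anyDuplicated(data_dict$Variables$name) == 0

# 'Categories': each ('variable', 'name') pair must be unique.
categories_ok <-
  anyDuplicated(data_dict$Categories[c("variable", "name")]) == 0
```

Both flags are TRUE for this object; duplicating a row in either data frame would turn the corresponding flag to FALSE.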
{
# use madshapR_DEMO provided by the package
library(madshapR)
library(dplyr)
# Create a list of data dictionaries where the column 'table' is added to
# refer to the associated dataset. The object created is not a
# data dictionary per se, but can be used as a structure which can be
# shaped into a data dictionary.
# Name the elements directly instead of assigning with `<-` inside list(),
# which would create stray objects in the calling environment.
data_dict_list <- list(
  dataset_TOKYO     = madshapR_DEMO$data_dict_TOKYO,
  dataset_MELBOURNE = madshapR_DEMO$data_dict_MELBOURNE)
data_dict_nest <-
  data_dict_list_nest(data_dict_list, name_group = 'table') %>%
  data_dict_group_by(col = "table")
glimpse(data_dict_group_split(data_dict_nest, col = "table"))
}
#> List of 2
#> $ dataset_MELBOURNE:List of 2
#> ..$ Variables : tibble [6 × 7] (S3: tbl_df/tbl/data.frame)
#> ..$ Categories: tibble [12 × 5] (S3: tbl_df/tbl/data.frame)
#> $ dataset_TOKYO :List of 2
#> ..$ Variables : tibble [9 × 7] (S3: tbl_df/tbl/data.frame)
#> ..$ Categories: tibble [11 × 5] (S3: tbl_df/tbl/data.frame)