Groups the data dictionary element(s) by the groups defined by the query. Both the 'Variables' and 'Categories' elements are grouped (if the group exists under the same definition in both). This function is analogous to running dplyr::group_by(). Each element is named using the group values. data_dict_ungroup() reverses the effect.

data_dict_group_by(data_dict, col)

Arguments

data_dict

A list of data frame(s) representing metadata to be transformed.

col

Variable (column) to group by.

Value

A list of data frame(s) identifying a workable data dictionary structure.

Details

A data dictionary contains the list of variables in a dataset, together with metadata about those variables, and can be associated with a dataset. A data dictionary object is a list of data frame(s) named 'Variables' (required) and 'Categories' (if any). To be usable in any function, the data frame 'Variables' must contain at least the name column, with all entries unique and non-missing, and the data frame 'Categories' must contain at least the variable and name columns, with unique combinations of variable and name.
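
The structural requirements above can be checked by hand. The following is a minimal sketch in base R, not the package's own validation; the object `data_dict_min` and its contents are hypothetical and only illustrate the required columns and uniqueness rules.

```r
# Hypothetical, hand-built data dictionary (illustration only)
data_dict_min <- list(
  Variables = data.frame(
    name      = c("gndr", "height"),
    valueType = c("integer", "integer"),
    stringsAsFactors = FALSE
  ),
  Categories = data.frame(
    variable = c("gndr", "gndr"),
    name     = c("1", "2"),
    stringsAsFactors = FALSE
  )
)

# 'Variables' must have a 'name' column with unique, non-missing entries
vars_ok <- "name" %in% names(data_dict_min$Variables) &&
  !anyNA(data_dict_min$Variables$name) &&
  anyDuplicated(data_dict_min$Variables$name) == 0

# 'Categories' must have unique combinations of 'variable' and 'name'
cats_ok <- all(c("variable", "name") %in% names(data_dict_min$Categories)) &&
  anyDuplicated(data_dict_min$Categories[c("variable", "name")]) == 0

vars_ok && cats_ok
```

A structure that fails either check would need to be corrected before being passed to functions that expect a data dictionary.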

Examples

{

library(dplyr)

# use madshapR_examples provided by the package
# Create a list of data dictionaries where the column 'table' is added to 
# refer to the associated dataset. The object created is not a 
# data dictionary per se, but can be used as a structure which can be 
# shaped into a data dictionary.

data_dict_list <- list(
  data_dict_1 = madshapR_examples$`data_dictionary_example`,
  data_dict_2 = madshapR_examples$`data_dictionary_example - collapsed`)

data_dict_ns <- 
  data_dict_list_nest(data_dict_list, name_group = "table")

data_dict_gp <- data_dict_group_by(data_dict_ns, col = "table")
glimpse(data_dict_gp)

}
#> List of 2
#>  $ Variables : gropd_df [18 × 11] (S3: grouped_df/tbl_df/tbl/data.frame)
#>   ..$ table                : chr [1:18] "data_dict_1" "data_dict_1" "data_dict_1" "data_dict_1" ...
#>   ..$ index                : chr [1:18] "1" "2" "3" "4" ...
#>   ..$ name                 : chr [1:18] "part_id" "gndr" "height" "weight_ms" ...
#>   ..$ label:en             : chr [1:18] "id of the participant" "gndr" "height" "weight_ms" ...
#>   ..$ description:en       : chr [1:18] "id of the participant" "gender of the participant" "height of the participant" "weight of the participant - measured" ...
#>   ..$ valueType            : chr [1:18] "text" "integer" "integer" "integer" ...
#>   ..$ unit                 : chr [1:18] NA NA "cm" "kg" ...
#>   ..$ datacollection::type : chr [1:18] "automatic" "declared" "declared" "measured" ...
#>   ..$ datacollection::level: chr [1:18] "high" "high" "moderate" "moderate" ...
#>   ..$ Categories::label:en : chr [1:18] NA NA NA NA ...
#>   ..$ Categories::missing  : chr [1:18] NA NA NA NA ...
#>   ..- attr(*, "groups")= tibble [2 × 2] (S3: tbl_df/tbl/data.frame)
#>   .. ..- attr(*, ".drop")= logi TRUE
#>  $ Categories: gropd_df [11 × 5] (S3: grouped_df/tbl_df/tbl/data.frame)
#>   ..$ table   : chr [1:11] "data_dict_1" "data_dict_1" "data_dict_1" "data_dict_1" ...
#>   ..$ variable: chr [1:11] "gndr" "gndr" "gndr" "weight_ms" ...
#>   ..$ name    : chr [1:11] "1" "2" "-77" "-88" ...
#>   ..$ label:en: chr [1:11] "Male" "Female" "Don’t want to answer" "Don’t want to answer" ...
#>   ..$ missing : chr [1:11] "FALSE" "FALSE" "TRUE" "TRUE" ...
#>   ..- attr(*, "groups")= tibble [1 × 2] (S3: tbl_df/tbl/data.frame)
#>   .. ..- attr(*, ".drop")= logi TRUE
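
The analogy with dplyr::group_by() can be illustrated with plain dplyr on a toy list of data frames. This is a sketch of the general idea only, under the assumption that grouping is applied to each element containing the grouping column; it is not the implementation used by data_dict_group_by(), and the objects `toy_dict`, `toy_grouped` and `toy_ungrouped` are hypothetical.

```r
library(dplyr)

# Hypothetical nested structure with a shared 'table' column
toy_dict <- list(
  Variables = tibble(
    table = c("data_dict_1", "data_dict_1", "data_dict_2"),
    name  = c("part_id", "gndr", "part_id")
  ),
  Categories = tibble(
    table    = c("data_dict_1", "data_dict_1"),
    variable = c("gndr", "gndr"),
    name     = c("1", "2")
  )
)

# Group each element by 'table' when that column is present
toy_grouped <- lapply(toy_dict, function(df) {
  if ("table" %in% names(df)) group_by(df, across(all_of("table"))) else df
})

# Number of groups in each element
n_groups(toy_grouped$Variables)
n_groups(toy_grouped$Categories)

# Ungrouping each element restores plain tibbles
toy_ungrouped <- lapply(toy_grouped, ungroup)
```

As in the output above, the two elements can end up with a different number of groups when a group value appears in 'Variables' but has no rows in 'Categories'.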