Transforms column(s) of a data dictionary from wide format to long format. If a taxonomy is provided, the corresponding columns in the data dictionary are converted to a standardized format with fewer columns. The operation is equivalent to applying tidyr::pivot_longer() to these columns, following the structure of the taxonomy provided. Variable names in the data dictionary must be unique.

data_dict_pivot_longer(data_dict, taxonomy = NULL)

Arguments

data_dict

A list of data frame(s) representing metadata to be transformed.

taxonomy

An optional data frame identifying a variable classification schema.

Value

A list of data frame(s) identifying a data dictionary, with the relevant columns in long format.

Details

A data dictionary lists the variables in a dataset together with metadata about them, and can be associated with a dataset. A data dictionary object is a list of data frame(s) named 'Variables' (required) and 'Categories' (if any). To be usable in any function, the data frame 'Variables' must contain at least the name column, with all entries unique and non-missing, and the data frame 'Categories' must contain at least the variable and name columns, with unique combinations of variable and name.
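As a rough illustration of the structure described above, the following base R sketch builds a minimal data dictionary by hand and checks the stated constraints. The variable names, labels, and the helper objects vars_ok and cats_ok are invented for this example and are not part of the package; the package may apply further checks of its own.

```r
# Hypothetical minimal data dictionary; all values are invented for illustration.
data_dict <- list(
  Variables = data.frame(
    name  = c("ID", "SEX"),
    label = c("Participant id", "Sex of participant"),
    stringsAsFactors = FALSE
  ),
  Categories = data.frame(
    variable = c("SEX", "SEX"),
    name     = c("0", "1"),
    label    = c("Male", "Female"),
    stringsAsFactors = FALSE
  )
)

# 'Variables': the name column must have no missing and no duplicated entries.
vars_ok <- !anyNA(data_dict$Variables$name) &&
  !anyDuplicated(data_dict$Variables$name)

# 'Categories': each (variable, name) combination must appear only once.
cats_ok <- !anyDuplicated(data_dict$Categories[c("variable", "name")])
```

Both vars_ok and cats_ok evaluate to TRUE for this toy object; a duplicated name in 'Variables' or a repeated (variable, name) pair in 'Categories' would make the corresponding check FALSE.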

A taxonomy is a classification schema that can be defined for variable attributes. A taxonomy is usually extracted from an Opal environment, and a taxonomy object is a data frame that must contain at least the columns taxonomy, vocabulary, and terms. Additional details about Opal taxonomies are available online.
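The minimal shape of a taxonomy object can be sketched in the same way. The taxonomy, vocabulary, and term values below are placeholders invented for this example, not entries from a real Opal taxonomy, and missing_cols is a hypothetical helper name.

```r
# Hypothetical taxonomy; all values are placeholders for illustration.
taxonomy <- data.frame(
  taxonomy   = c("Example_area", "Example_area"),
  vocabulary = c("Lifestyle", "Lifestyle"),
  terms      = c("Tobacco", "Alcohol"),
  stringsAsFactors = FALSE
)

# Check that the three required columns are present.
required     <- c("taxonomy", "vocabulary", "terms")
missing_cols <- setdiff(required, names(taxonomy))
```

An empty missing_cols means the data frame has the columns a taxonomy object requires; in practice the taxonomy would usually come from an Opal environment or from the demo data shipped with the package.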

Examples

{

# use madshapR_DEMO provided by the package
library(madshapR)

data_dict <- madshapR_DEMO$`data_dict_PARIS - collapsed`
taxonomy <- madshapR_DEMO$taxonomy_PARIS
data_dict_pivot_longer(data_dict, taxonomy)

}
#> $Variables
#> # A tibble: 7 × 11
#>   index name  `label:fr` `description:fr` valueType unit  `Categories::label:fr`
#>   <dbl> <chr> <chr>      <chr>            <chr>     <chr> <chr>                 
#> 1     1 ID    id         id du participa… text      NA     NA                   
#> 2     2 SEX   Sexe       Sexe du partici… integer   NA    "0 = Homme ; 1 = Femm…
#> 3     3 BMI   IMC        Indice de Masse… decimal   kg/m…  NA                   
#> 4     4 AGE   Age        Age du Particip… integer   years  NA                   
#> 5     5 SMO   Fumeur     Fumeur regulier  integer   NA    "0 = Non-fumeur _; 1 …
#> 6     6 SMO_… fum_qte_s… nombre de cigar… integer   ciga… "-8 = SKIP PATTERN"   
#> 7     7 PRG_… enceinte_… Si la personne … integer   NA    "0   _= Jamais encein…
#> # ℹ 4 more variables: `description::1` <chr>, `description::1.term` <chr>,
#> #   `description::2` <chr>, `description::2.term` <chr>
#>