R/02-dictionaries_functions.R
as_data_dict.Rd
Checks if an object is a valid data dictionary and returns it with the
appropriate madshapR::class attribute. This function mainly validates
inputs within other functions of the package, but it can also be used
directly to check whether an object is valid for use in a function. If the
columns 'typeof' or 'class' already exist in 'Variables', or 'na_values' or
'labels' in 'Categories', the function returns the data dictionary unchanged.
Otherwise, these columns are added, using 'valueType' in 'Variables', and
'label' and 'missing' in 'Categories'.
as_data_dict(object)
A potential data dictionary object to be coerced.
A list of data frame(s) with madshapR::class 'data_dict'.
A data dictionary contains the list of variables in a dataset and metadata
about the variables and can be associated with a dataset. A data dictionary
object is a list of data frame(s) named 'Variables' (required) and
'Categories' (if any). To be usable in any function, the data frame
'Variables' must contain at least the name column, with all unique and
non-missing entries, and the data frame 'Categories' must contain at least
the variable and name columns, with unique combinations of variable and name.
For a better assessment, please use data_dict_evaluate().
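The minimal structural requirements described above can be sketched in base R.
This is only an illustration: check_min_data_dict() is a hypothetical helper
that is not part of madshapR, and it does not reproduce what as_data_dict() or
data_dict_evaluate() actually do internally. It only tests the stated rules
(a 'Variables' data frame with unique, non-missing name values, and, if
present, a 'Categories' data frame with unique variable/name pairs).

```r
# Hypothetical helper (NOT part of madshapR): returns a character vector of
# problems found, or character(0) if the minimal requirements are met.
check_min_data_dict <- function(object) {
  problems <- character(0)

  # 'Variables' is required and must be a data frame inside a list
  if (!is.list(object) || !is.data.frame(object[["Variables"]])) {
    return("object must be a list containing a data frame named 'Variables'")
  }

  # 'name' in 'Variables' must exist, be unique, and have no missing entries
  vars <- object[["Variables"]]
  if (!"name" %in% names(vars)) {
    problems <- c(problems, "'Variables' needs a 'name' column")
  } else if (anyNA(vars$name) || anyDuplicated(vars$name) > 0) {
    problems <- c(problems,
                  "'name' in 'Variables' must be unique and non-missing")
  }

  # 'Categories' is optional; if present, variable/name pairs must be unique
  cats <- object[["Categories"]]
  if (!is.null(cats)) {
    if (!is.data.frame(cats) ||
        !all(c("variable", "name") %in% names(cats))) {
      problems <- c(problems,
                    "'Categories' needs 'variable' and 'name' columns")
    } else if (anyDuplicated(cats[, c("variable", "name")]) > 0) {
      problems <- c(problems,
                    "'variable'/'name' pairs in 'Categories' must be unique")
    }
  }

  problems
}

# Illustrative inputs (made up for this sketch)
ok <- list(
  Variables  = data.frame(name = c("gndr", "height")),
  Categories = data.frame(variable = c("gndr", "gndr"), name = c("1", "2"))
)
bad <- list(Variables = data.frame(name = c("gndr", "gndr")))

check_min_data_dict(ok)   # no problems reported
check_min_data_dict(bad)  # reports the duplicated 'name' entries
```

A check like this only covers structure; for a fuller assessment of content,
data_dict_evaluate() remains the function to use.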
{
library(dplyr)
# use madshapR_examples provided by the package
###### Example 1 : use the function to apply the attribute "data_dict" to the
# object.
data_dict <-
as_data_dict(madshapR_examples$`data_dictionary_example - as_data_dict`)
glimpse(data_dict)
###### Example 2 : use the function to shape the data dictionary formatted as
# data_dict_mlstr to data_dict object. The function mainly converts valueType
# column into corresponding typeof/class columns in 'Variables', and converts
# missing column into "na_values" column.
data_dict <- as_data_dict_mlstr(madshapR_examples$`data_dictionary_example`)
data_dict <- as_data_dict(data_dict)
glimpse(data_dict)
}
#> List of 2
#> $ Variables : tibble [9 × 9] (S3: tbl_df/tbl/data.frame)
#> ..$ index : int [1:9] 1 2 3 4 5 6 7 8 9
#> ..$ name : chr [1:9] "part_id" "gndr" "height" "weight_ms" ...
#> ..$ label:en : chr [1:9] "id of the participant" "gndr" "height" "weight_ms" ...
#> ..$ description:en : chr [1:9] "id of the participant" "gender of the participant" "height of the participant" "weight of the participant - measured" ...
#> ..$ typeof : chr [1:9] "character" "integer" "integer" "integer" ...
#> ..$ class : chr [1:9] NA NA NA NA ...
#> ..$ unit : chr [1:9] NA NA "cm" "kg" ...
#> ..$ datacollection::type : chr [1:9] "automatic" "declared" "declared" "measured" ...
#> ..$ datacollection::level: chr [1:9] "high" "high" "moderate" "moderate" ...
#> $ Categories: tibble [11 × 4] (S3: tbl_df/tbl/data.frame)
#> ..$ variable : chr [1:11] "gndr" "gndr" "gndr" "weight_ms" ...
#> ..$ name : chr [1:11] "-77" "1" "2" "-99" ...
#> ..$ labels : chr [1:11] "Don’t want to answer" "Male" "Female" "Don’t know" ...
#> ..$ na_values: chr [1:11] "Don’t want to answer" NA NA "Don’t know" ...
#> - attr(*, "madshapR::class")= chr "data_dict"
#> List of 2
#> $ Variables : tibble [9 × 9] (S3: tbl_df/tbl/data.frame)
#> ..$ index : int [1:9] 1 2 3 4 5 6 7 8 9
#> ..$ name : chr [1:9] "part_id" "gndr" "height" "weight_ms" ...
#> ..$ label:en : chr [1:9] "id of the participant" "gndr" "height" "weight_ms" ...
#> ..$ description:en : chr [1:9] "id of the participant" "gender of the participant" "height of the participant" "weight of the participant - measured" ...
#> ..$ typeof : chr [1:9] "character" "integer" "integer" "integer" ...
#> ..$ class : chr [1:9] NA NA NA NA ...
#> ..$ unit : chr [1:9] NA NA "cm" "kg" ...
#> ..$ datacollection::type : chr [1:9] "automatic" "declared" "declared" "measured" ...
#> ..$ datacollection::level: chr [1:9] "high" "high" "moderate" "moderate" ...
#> $ Categories: tibble [11 × 4] (S3: tbl_df/tbl/data.frame)
#> ..$ variable : chr [1:11] "gndr" "gndr" "gndr" "weight_ms" ...
#> ..$ name : chr [1:11] "1" "2" "-77" "-99" ...
#> ..$ labels : chr [1:11] "Male" "Female" "Don’t want to answer" "Don’t know" ...
#> ..$ na_values: chr [1:11] NA NA "Don’t want to answer" "Don’t know" ...
#> - attr(*, "madshapR::class")= chr "data_dict"