R/02-dictionaries_functions.R
as_data_dict_mlstr.Rd
Validates the input object as a valid data dictionary compliant with the
formats used in the Maelstrom Research ecosystem, including Opal, and returns
it with the appropriate madshapR::class attribute. This function mainly helps
validate input within other functions of the package, but it can also be used
to check whether an object is valid for use in a function.
as_data_dict_mlstr(object, name_standard = FALSE)
object: A potentially valid data dictionary to be coerced.
name_standard: Whether the input data dictionary has variable names compatible with the Maelstrom Research ecosystem (including Opal). FALSE by default.
A list of data frame(s) with madshapR::class 'data_dict_mlstr'.
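As a rough illustration of what that return value looks like from the caller's side, the sketch below tags a stand-in list with the attribute and reads it back using base R only. The attribute name and value are taken from this page; the stand-in object and the helper is_tagged_mlstr() are hypothetical and not part of madshapR.

```r
# Stand-in for a returned object. A real one would come from
# as_data_dict_mlstr(); this list is made up for illustration.
result <- list(Variables = data.frame(name = "part_id"))
attr(result, "madshapR::class") <- "data_dict_mlstr"

# Hypothetical helper (not part of madshapR): TRUE only when the object
# carries the attribute with exactly the expected value.
is_tagged_mlstr <- function(x) {
  identical(attr(x, "madshapR::class", exact = TRUE), "data_dict_mlstr")
}

is_tagged_mlstr(result)  # TRUE
is_tagged_mlstr(list())  # FALSE
```

This only inspects the attribute; it does not validate the content of the data dictionary.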
A data dictionary contains the list of variables in a dataset and metadata
about the variables and can be associated with a dataset. A data dictionary
object is a list of data frame(s) named 'Variables' (required) and
'Categories' (if any). To be usable in any function, the data frame
'Variables' must contain at least the name column, with all unique and
non-missing entries, and the data frame 'Categories' must contain at least
the variable and name columns, with unique combinations of variable and name.
The object may be specifically formatted to be compatible with additional Maelstrom Research software, in particular Opal environments.
For a better assessment, please use data_dict_evaluate().
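The minimal structure described above can be sketched in base R. The helper check_min_data_dict() below is hypothetical and not part of madshapR; it checks only the rules stated on this page (a 'Variables' data frame with unique, non-missing name entries, and an optional 'Categories' data frame with unique variable/name pairs), not the additional checks the package may perform. The example data dictionary is made up.

```r
# Hypothetical helper (not part of madshapR): returns a character vector
# of problems found, or character(0) when the minimal rules are met.
check_min_data_dict <- function(data_dict) {
  problems <- character(0)

  # 'Variables' is required and must have a 'name' column.
  variables <- data_dict[["Variables"]]
  if (!is.data.frame(variables) || !"name" %in% names(variables)) {
    return("'Variables' must be a data frame with a 'name' column")
  }
  if (anyNA(variables$name)) {
    problems <- c(problems, "'Variables$name' has missing entries")
  }
  if (anyDuplicated(variables$name) > 0) {
    problems <- c(problems, "'Variables$name' has duplicated entries")
  }

  # 'Categories' is optional; when present it needs 'variable' and 'name'.
  categories <- data_dict[["Categories"]]
  if (!is.null(categories)) {
    required <- c("variable", "name")
    if (!is.data.frame(categories) || !all(required %in% names(categories))) {
      problems <- c(problems,
                    "'Categories' must have 'variable' and 'name' columns")
    } else if (anyDuplicated(categories[required]) > 0) {
      problems <- c(problems,
                    "'Categories' has duplicated variable/name pairs")
    }
  }

  problems
}

# Made-up data dictionary that satisfies the minimal rules.
data_dict <- list(
  Variables = data.frame(
    name = c("part_id", "gndr"),
    stringsAsFactors = FALSE
  ),
  Categories = data.frame(
    variable = c("gndr", "gndr"),
    name = c("1", "2"),
    stringsAsFactors = FALSE
  )
)

check_min_data_dict(data_dict)  # character(0): no problems found
```

Returning a vector of problems rather than stopping at the first one keeps the sketch easy to test; the package functions themselves may behave differently.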
{
library(dplyr)
###### Example 1: use the function to apply the attribute "data_dict_mlstr"
# to the object.
data_dict <-
as_data_dict_mlstr(madshapR_examples$`data_dictionary_example`)
glimpse(data_dict)
###### Example 2: use the function to shape a data dictionary formatted as
# data_dict into a data_dict_mlstr object. As the output shows, the result
# carries a valueType column in 'Variables' and a missing column in
# 'Categories'.
data_dict <-
as_data_dict_mlstr(madshapR_examples$`data_dictionary_example - as_data_dict`)
glimpse(data_dict)
}
#> List of 2
#> $ Variables : tibble [9 × 8] (S3: tbl_df/tbl/data.frame)
#> ..$ index : int [1:9] 1 2 3 4 5 6 7 8 9
#> ..$ name : chr [1:9] "part_id" "gndr" "height" "weight_ms" ...
#> ..$ label:en : chr [1:9] "id of the participant" "gndr" "height" "weight_ms" ...
#> ..$ description:en : chr [1:9] "id of the participant" "gender of the participant" "height of the participant" "weight of the participant - measured" ...
#> ..$ valueType : chr [1:9] "text" "integer" "integer" "integer" ...
#> ..$ unit : chr [1:9] NA NA "cm" "kg" ...
#> ..$ datacollection::type : chr [1:9] "automatic" "declared" "declared" "measured" ...
#> ..$ datacollection::level: chr [1:9] "high" "high" "moderate" "moderate" ...
#> $ Categories: tibble [11 × 4] (S3: tbl_df/tbl/data.frame)
#> ..$ variable: chr [1:11] "gndr" "gndr" "gndr" "weight_ms" ...
#> ..$ name : chr [1:11] "1" "2" "-77" "-99" ...
#> ..$ label:en: chr [1:11] "Male" "Female" "Don’t want to answer" "Don’t know" ...
#> ..$ missing : logi [1:11] FALSE FALSE TRUE TRUE TRUE FALSE ...
#> - attr(*, "madshapR::class")= chr "data_dict_mlstr"
#> List of 2
#> $ Variables : tibble [9 × 8] (S3: tbl_df/tbl/data.frame)
#> ..$ index : int [1:9] 1 2 3 4 5 6 7 8 9
#> ..$ name : chr [1:9] "part_id" "gndr" "height" "weight_ms" ...
#> ..$ label:en : chr [1:9] "id of the participant" "gndr" "height" "weight_ms" ...
#> ..$ description:en : chr [1:9] "id of the participant" "gender of the participant" "height of the participant" "weight of the participant - measured" ...
#> ..$ valueType : chr [1:9] "text" "integer" "integer" "integer" ...
#> ..$ unit : chr [1:9] NA NA "cm" "kg" ...
#> ..$ datacollection::type : chr [1:9] "automatic" "declared" "declared" "measured" ...
#> ..$ datacollection::level: chr [1:9] "high" "high" "moderate" "moderate" ...
#> $ Categories: tibble [11 × 4] (S3: tbl_df/tbl/data.frame)
#> ..$ variable: chr [1:11] "gndr" "gndr" "gndr" "weight_ms" ...
#> ..$ name : chr [1:11] "1" "2" "-77" "-99" ...
#> ..$ label:en: chr [1:11] "Male" "Female" "Don’t want to answer" "Don’t know" ...
#> ..$ missing : logi [1:11] FALSE FALSE TRUE TRUE TRUE FALSE ...
#> - attr(*, "madshapR::class")= chr "data_dict_mlstr"