Checks whether an object is a valid DataSchema and returns it with the appropriate Rmonize::class attribute. This function is mainly used to validate inputs within other functions of the package, but it can also be called on its own to confirm that an object has the appropriate structure.

as_dataschema(object, as_dataschema_mlstr = FALSE)

Arguments

object

A potential DataSchema object to be coerced.

as_dataschema_mlstr

Whether the output DataSchema should be coerced with specific format restrictions for compatibility with other Maelstrom Research software. FALSE by default.

Value

A list of data frame(s) named 'Variables' and (if any) 'Categories', with Rmonize::class 'dataschema'.

Details

A DataSchema is the list of core variables to generate across datasets, together with their related metadata. A DataSchema object is a list of data frames with elements named 'Variables' (required) and 'Categories' (if any). To be usable in any function, the 'Variables' element must contain at least the name column, and the 'Categories' element must contain at least the variable and name columns. In 'Variables', the entries of the name column must be unique; in 'Categories', each combination of the variable and name columns must be unique.

Setting as_dataschema_mlstr = TRUE formats the object for compatibility with additional Maelstrom Research software, in particular Opal environments.
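
The structural rules for 'Variables' and 'Categories' can be illustrated with a small base R sketch. The helper check_dataschema_structure() is hypothetical and is not part of Rmonize; it only mirrors the requirements stated in this section (required columns and uniqueness) and makes no claim about how as_dataschema() performs its own checks. The variable and category names are borrowed from the example output on this page.

```r
# Hypothetical helper (NOT part of Rmonize): checks only the structural
# rules described in this section, using base R.
check_dataschema_structure <- function(x) {
  problems <- character(0)

  # 'Variables' is required and must be a data frame inside a plain list.
  if (!is.list(x) || is.data.frame(x) || !is.data.frame(x[["Variables"]])) {
    return("the object must be a list with a 'Variables' data frame")
  }

  # 'Variables' needs a 'name' column with unique entries.
  variables <- x[["Variables"]]
  if (!"name" %in% names(variables)) {
    problems <- c(problems, "'Variables' needs a 'name' column")
  } else if (anyDuplicated(variables[["name"]]) > 0) {
    problems <- c(problems, "'name' in 'Variables' must be unique")
  }

  # 'Categories' is optional; if present it needs 'variable' and 'name',
  # and each variable/name pair must be unique.
  categories <- x[["Categories"]]
  if (!is.null(categories)) {
    needed <- c("variable", "name")
    if (!is.data.frame(categories) || !all(needed %in% names(categories))) {
      problems <- c(problems, "'Categories' needs 'variable' and 'name' columns")
    } else if (anyDuplicated(categories[needed]) > 0) {
      problems <- c(problems, "'Categories' has duplicated variable/name pairs")
    }
  }

  problems
}

# A minimal list shaped like a DataSchema.
dataschema_sketch <- list(
  Variables = data.frame(
    name = c("adm_study", "sdc_sex"),
    stringsAsFactors = FALSE
  ),
  Categories = data.frame(
    variable = c("adm_study", "adm_study", "sdc_sex"),
    name = c("MELBOURNE", "PARIS", "1"),
    stringsAsFactors = FALSE
  )
)

check_dataschema_structure(dataschema_sketch)  # character(0): no problems found

# Repeating a variable/name pair violates the uniqueness rule.
broken <- dataschema_sketch
broken$Categories <- rbind(broken$Categories, broken$Categories[1, ])
check_dataschema_structure(broken)
```

A list that passes this sketch satisfies only the minimum requirements listed above; as_dataschema() remains the function to use for actual validation and for attaching the Rmonize::class attribute.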

Examples

{

# Use Rmonize_DEMO to run examples.
library(dplyr)

glimpse(as_dataschema(Rmonize_DEMO$`dataschema - final`))

}
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
#> List of 2
#>  $ Variables : tibble [13 × 5] (S3: tbl_df/tbl/data.frame)
#>   ..$ name     : chr [1:13] "adm_unique_id" "adm_study" "adm_year_dce" "sdc_age" ...
#>   ..$ typeof   : chr [1:13] "character" "character" "character" "integer" ...
#>   ..$ index    : chr [1:13] "1" "2" "3" "4" ...
#>   ..$ label:en : chr [1:13] "Unique identification code of the participant." "Indicator of the survey study." "Indicator of the survey data collection event." "Participant's age at time of data collection event." ...
#>   ..$ valueType: chr [1:13] "text" "text" "text" "integer" ...
#>  $ Categories: tibble [16 × 5] (S3: tbl_df/tbl/data.frame)
#>   ..$ variable: chr [1:16] "adm_study" "adm_study" "adm_study" "sdc_sex" ...
#>   ..$ name    : chr [1:16] "MELBOURNE" "PARIS" "TOKYO" "1" ...
#>   ..$ labels  : chr [1:16] "MELBOURNE" "PARIS" "TOKYO" "1" ...
#>   ..$ label:en: chr [1:16] "MELBOURNE" "PARIS" "TOKYO" "Male" ...
#>   ..$ missing : chr [1:16] "0" "0" "0" "0" ...
#>  - attr(*, "madshapR::class")= chr "data_dict"
#>  - attr(*, "Rmonize::class")= chr "dataschema"